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Research  Overview 


1 . 


This  document  presents  results  of  research  over  the  past  six 
months  at  the  DSC  Image  Processing  Institute.  This  report  is  divided 
in  two  major  areas:  image  understanding  projects,  and  smart  sensor 
projects.  These  areas  are  abstracted  below. 

Image  Understand  ing  Projects 

The  image  understanding  research  has  been  directed  at  the  various 
levels  of  an  image  understanding  system.  In  the  first  section, 
Abramatic  and  Lee  describe  further  developments  in  techniques  of  rapid 
convolution  of  images  using  small  generating  kernels. 

A  major  area  of  research  in  the  last  six  months  has  been  texture 
analysis  and  this  report  contains  four  sections  related  to  this  topic. 
Laws  describes  texture  energy  measures  using  smell  (3x3,  5x5)  filter 
masks.  The  classification  results  appear  to  be  better  than  for  the 
more  expensive  gray  level  co-occurence  computation  methods. 
Vilnrotter,  Nevatia  and  Price  present  new  resuls  in  automatically 
generating  structural  texture  descriptions  for  natural  textures. 
Garber  describes  a  mathematical  model  for  textures  and  its  use  in 
texture  analysis  as  well  as  synthesis.  Ashjari  present  a  theoretical 
analysis  of  the  probability  density  function  of  the  singular  value  of 
a  random  texture  field.  These  singular  values  are  believed  to  be 
useful  texture  discriminants. 

In  work  at  higher  levels  of  our  image  understanding  system,  Babu 
describes  an  approach  to  the  recognition  of  structured  objects  such  as 
airports.  Price  presents  new  results  in  matching  map  models  to  aerial 
images  of  Stockton  Ca .  and  San  Diego,  Ca .  areas. 

Finally,  Lo  and  Sawchuk  present  a  comparison  between  a  maximum  a 
poster  ior i  (MAP)  non-linear  filter  developed  by  them  and  a 
conventional  linear  minimum-mean-square  error  filter  for  restoration 


of  images  degraded  by  Poisson  noise. 

Smart  Censor  Pt£J£i"£f 

The  Hughes  Research  Laboratories  present  a  section  describing 
research  progress  on  the  develop>ment  of  smart  sensors  for  image 
processing.  Hughes  has  completed  the  construction  of  a  new  CCD  chip 
and  is  currently  testing  and  evaluating  its  performance.  This  chip  is 
designed  to  perform  the  following  functions: 

3x3  Laplacian 

5x5  med  lan  filter 

5x5  programmable  weight  convolver 

7x7  bipolar  convolver 

2bx2t>  edge  detection  convolver  . 


2. 


I  mage  yD^£‘I“tirD22D9 


2.1  Par  a  1  lei -Cascade 

Convol ut ion 

of 

Images : 

F i xed-Po i n  t 

Implementat ion 

J.F.  Abramatic  and 

Sang  Uk  Lee 

Introduction 


Convolution  of  images  has  proved  to  be  very  useful  in  many  areas 
including  image  enhancement,  image  restoration  and  edge  detection. 
However,  whenever  the  size  of  convolution  kernel  is  large,  the 
processing  can  be  very  costly  in  time  as  well  as  in  memory 
requirements.  Among  the  recent  techniques  for  alleviating  this 
problem  is  the  idea  of  sequentially  convolving  the  image  with  small 
size  kernels  [1].  Our  work  in  SVD/SGK  filter  follows  this  new 
approach. 


SVD/SGK  Convolution  Filters 

Using  Singular  Value  Decomposition  (SVD)  technique,  we  can 
decompose  any  LxL  impulse  response  H  as  the  sum  of  R  separable  2-D 
filters  [ 2 ] . 


R 


H-  E 


i=l 


ip  .  a  .  bfc 

l—i—i 


(1) 


where  R  is  the  rank  of  H,  ip .  ,  i~l,...R  are  the  singular  values  of  H, 

1  T  — 

which  are  the  square  roots  of  the  eigenvalues  of  H  H,  {a,}  ,  (b-  }  are 

T  T  1 

the  eigenvalues  of  H  H  and  HH  ,  respectively.  Whenever  R  is  less  than 

L,  the  number  of  operations  needed  to  perform  the  convolution  can  be 


reduced.  Also,  when  •,  becomes  small,  the  decomposition 
truncated  without  introducing  significant  errors  in  the 
filtering  process. 


can  be 
over  all 


In  general,  we  will  be  satisfied  with  a  multistage  expansion  that 
will  closely  approximate  H.  In  this  case 


K 


H  =  Z_j  U’-a 

i  =  l 


_  bfc 
1-1-1 


where  K  is  the  number  of  retained  terms  (K  <  R) .  In  many  cases,  the 
impulse  response  H  can  be  approximated  with  a  small  error  by  only 
retaining  the  first  few  dominant  terms  of  Fq .  (1).  This  suggests  a 
parallel  implementation  of  the  2-D  filter  described  in  Fig.  1. 
Moreover,  each  of  the  filters  H ^ ,  where  =  y^Sjb^,  can  be  realized 
by  cascading  3x1  SGK  filters,  oriented  either  in  the  column  or  in  the 
row  direction.  Figure  2  shows  the  basic  implementation  structure  of 
SVD./SGK  convolution  filter. 


F  X£5lP9i  Dl  Finite-Word  Length  Ar  ithmet  ic  Implementation  of  SVD  /SGK 
F  i_l  t  er  s 

To  take  advantage  of  the  SVC  SGK  filters,  one  will  often  have  to 
implement  them  in  special-purpose  hardware.  Image  processing  display 
is  one  example  where  such  filters  may  be  used.  They  will  perform 
convolution  without  needing  the  capabilities  of  a  general  purpose  host 
computer.  In  such  case,  the  use  of  fixed-point  finite-word  length 
arithmetic  will  be  tequited  if  one  wants  to  have  "real-time" 
capabilities.  ["Real-time"  will  mean  here  to  be  able  to  perform 
convolutions  between  two  frame  displays.  The  unit  of  time  will 
basically  be  1,30  seccndl.  Using  fixed-point  arithmetic  will 
introduce  errors  in  the  computations.  We  can  classify  these  errors 
into  three  types:  a)  filter  coefficients  quantization  error,  b) 
roundoff  error,  c)  overflow  error.  The  quantization  errors  occur 
because  the  coefficients  of  the  filters  will  not  take  the  exact  value 


they  were  assigned  by  the  design  procedure.  This  is  due  to  the  finite 
word  length  used  to  represent  these  numbers.  Experimental  results 
have  proved  that  the  amount  of  errors  of  this  type  is  not  significant 
in  our  case.  Roundoff  errors  occur  every  time  a  multiplication  is 
performed.  Multiplying  an  n  bits  (including  sign  bit)  number  by  an  m 
bits  number  will  result  in  an  n+m-1  bits  number.  Generally,  only  p 
bits  (p  <  n+m-1)  will  be  available  to  store  the  result  of 
multiplication.  Rounding/truncation  operation  will  be  performed 
introducing  the  so-called  roundoff  error.  Finally,  the  use  of  fixed 
point  arithmetic  will  constrain  the  numbers  to  belong  to  a  given 
dynamic  range  say  -1  to  +1.  Performing  an  addition  between  two 
numbers  may  result  in  a  number  which  does  not  belong  to  this  interval. 
The  truncation  operation  that  will  assign  the  limit  value  to  the 
result  (say  -1  or  +1)  will  introduce  a  so-called  overflow  error. 

To  minimize  the  discussed  errors,  we  have  two  kinds  of  degrees  of 
freedom:  a)  every  subfilter  of  the  cascade  structure  can  be  scaled  by 
any  factor  as  long  as  the  overall  gain  of  the  filter  is  kept  to  its 
actual  value,  b)  the  order  with  which  the  elements  of  the  cascade  are 
used,  can  be  chosen  as  we  wish.  We  will  have  thus  to  derive  a  scaling 
procedure  that  will  deal  with  the  overflow  problem  and  an  ordering 
procedure  for  minimizing  the  effect  of  roundoff  errors. 

Procedure 

Let  us  consider  the  case  where  a  ( 2Q+1 ) x ( 2Q+1 )  filter  has  been 
decomposed  by  means  of  the  SVD/SGK  structure.  We  will  be  concerned 
here  by  one  SVD  expansion  that  implements  the  cascade  convolution  of  0 
1-D  column  operation  filters  and  0  1-D  row  operation  filters,  all 
of  them  of  size  3x1,  The  only  information  we  have  on  the  input  is 
that  it  has  been  normalized  to  the  range  [0,1].  We  will  use  the 
sum-scaling  procedure  to  prevent  overflow  errors.  (See  Gold-Rabiner 
[3]).  If  h1(k,£)  denotes  the  impulse  response  of  the  system  from  the 
input  to  the  ith  subfilter  output,  h1(k,5-)  is  given  by 

h1(k,«.)  =  Si  [g1  (k,  £)  **g2  (k,  i)  **  .  .  .  **g1  (k,  l)  ]  (2) 
where  g 1  ( k  ,  JL )  is  defined  to  be 
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0 


g  (k,s.)  = 


c 


0  0 


0  0 


0 

0 

0 


r 


.  ,  . th  .  .  . 

if  i  section  is  column 

or i on ted  f  i 1  tor 


(  1) 


.  t  h 

if  i  section  is  row 
oriented  1  i 1  ter 


is  a  scaling  factor.  Using  the  sum- scaling  procedure,  we  will 
insure  that  no  overflow  occurs  by  imposing 


y  ]  I  h 1  ( k ,  e. )  I  1  for  i  1,2,...,  20 


(4) 


As  each  of  the  subfilter  filters  are  either  row  or  column  operation 
subfilter,  h  ( k ,  C.)  can  be  rewritten  as 


h 1  ( k ,  e )  Sigjuoq^tn 


(5) 


where  g^fk)  and  q ^ (  e )  arc  the  impulse  responses  obtained  by  gathering 
the  j  column  and  the  (i-j)  row  operation  subfilter,  respectively. 
Equation  (4)  can  then  be  rewritten  as 


1 

£  In'  (k )  iZh'u)  ; 

k  1  t  K 

The  scaling  factor  used  fot  the  i th  section  is  finally  given  by 
♦‘denotes  the  2-D  convolution  while  ♦  denotes  the  1-0  convolution. 
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a . 


(7) 


c* 

l-  ] 

One  can  see  then  that  this  factor  will  be  the  same  as  the  one  that  one 
would  have  used  cascading  the  j  column  or  the  (i-j)  rows  only.  To 
prove  that,  let  us  consider  that  the  ith  section  is  a  column  oriented 
filter.  In  that  case  o.  is  given  by 


H  lo*  1 (k) I 

k 


(8) 


The  terms  corresponding  to  the  row  convolutions  in  and  S ■ _  j  will  be 
identical.  This  result  will  lighten  the  burden  of  calculating  the 
scaling  factors  as  the  overall  2-D  scaling  procedure  reduces  to  two 
1-D  scaling  procedures.  We  will  denote  Y 1  ( k  ,  ? )  the  scaled  impulse 
response  of  the  ith  section. 

of  the  Round-off  Errors 


When  an  n+m-1  bits  number  is  rounded  to  an  p  bits  number,  the 


error  introduced  can  be  assumed  to  be  the  realization  of 
variable  uniformly  distributed  between  -2  ^ 1  *  *  *  and  2  ^  . 


a  random 
A  more 


restrictive  assumption  will  be  to  consider  that  all  the  roundoff 
etrors  occurring  as  a  result  of  rounding/truncation  are  independent 


random  quantities.  Figure  3  describes  one  stage  FVD  expanded  filter 
roundoff  noise  model 
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Figure  3.  SVD/SGK  Filter  Roundoff  Noise  Model. 


Fiqure  4.  Prospective  View  of  Prototype  Filter 
Frequency  Response. 


i 


whe  t  e 


a 


2 

k 


2-2P 

TT~ 


and 
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fi (k, e) 


i  +  1 

'  (k,  l) 


*  * 


...**> 


20 

(k,  c) 


(11) 


f^(k,i)  is  a  separable  impulse  response  and  thus  the  error  can  be 
rewritten  as 


20 

f  =  °R  2  (  Tj  |ftl(k)  I2  E  |fj(C)  I2)  (12) 

t-1  k  K 

This  formulation  will  be  useful  to  generalize  to  the  2-D  separable 
case  the  ordering  algorithm  proposed  by  Chan  and  Rabiner  [4). 

Dealing  with  a  { 2Q+1 ) x ( 2Q+ 1 )  kernel,  gives  2Q  subfilters  of  size 
of  3x1  either  column  or  row  operators.  We  thus  have  (20)!  possible 
orderings.  As  in  the  1-D  case  testing  all  the  possibilities  will 
become  almost  impossible  as  0  grows,  but  good  sub-optimal  procedures 
will  prove  to  be  efficient.  This  is  the  reason  why  we  will  generalize 
the  2-D  separable  case  with  the  Chan-Rabi ner ' s  1-D  sub-optimal 
algor  i t hm . 

This  algorithm  consists  in  building  the  cascade  structure  section 
after  section  starting  from  the  end,  minimizing  at  each  step  the  new 
term  introduced  in  E’q.  (12).  Following  this  approach,  we  will  show 
that  the  resulting  ordering  of  the  rows  (resp.  the  columns)  will  be 
the  same  as  the  one  obtained  by  ordering  the  rows  (resp.  the  columns) 
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as  if  we  were  solving  two  1-D  problems.  The  algorithm  will  provide 
then  a  good  way  of  ordering  the  rows  "among"  the  columns.  To  show 
that  result,  let  us  assume  that  we  have  already  chosen  Q- i  sections,  j 
of  them  ate  columns,  Q-i-j  are  rows.  We  want  to  determine  the  ith 
section  of  the  cascade  structure.  Let  us  define  C^,  R  as 


■i-  5 


f*+1(k)  |2 


*i  ’  ?  '4+1 


If  we  choose  a  column  as  the  ith  section,  the  new  term  in  Eq .  (12) 
will  be 


Ri  -  £  |fi(k)  |2 

k  c 


if  it  is  a  row  we  will  have 


e?  =  c.  £  | f im 


It  is  clear  now  that  to  get  the  smallest  E.,  one  has  only  to  compare 
the  results  given  by  the  row  and  the  column  that  would  have  been 
chosen  in  the  1-D  algorithm  orderings. 
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2lfi§£lnS  Algorithm 

The  Ordering  algorithm  for  one  2-D  separable  channel  goes  as 
follows . 

i)  Run  the  1-D  Chan-Rabiner  ordering  algorithm  for  the  rows  and 
for  the  columns  separately.  (The  calculation  of  the  scaling 
factors  is  included  in  this  step.) 

ii)  For  i  going  from  2Q  to  1  ;  compare  E  and  E  ^  given  by 
Eqs.  (15),  (16)  to  decide  whether  a  row  or  a  column  should  become 
the  ith  section. 

This  algorithm  has  been  tested  by  using  as  inputs  either  2-D  white 
noise  or  real  images.  The  results  have  been  compared  to  other  "common 
sense"  orderings.  In  all  cases  the  ordering  provided  by  the  algorithm 
has  proved  to  be  the  best. 

Exper iments 

We  have  chosen  to  use  a  zero-phase  bandpass  filter  with  a  size  of 
11x11  as  a  prototype  filter.  The  frequency  response  of  the  prototype 
filter  is  shown  in  Fig.  4.  The  prototype  filter  was  expanded  using 
the  SVD  technique  and  approximated  by  retaining  only  the  4  largest 
terms  in  Eq.  (1).  The  resulting  approximating  filter  yields  0.92% 
normalized  mean  square  error  (NMSE ) .  Several  experiments  have  been 
performed  to  evaluate  the  SVD/SGK  convolution  filter  with  fixed-point 
arithmetic.  The  original  image  is  shown  in  Fig.  5a  and  the  result  of 
convolution  with  a  11x11  prototype  filter  with  a  floating  point 
arithmetic  is  shown  in  Fig.  5b.  The  result  of  convolution  with 
SVD/SGK  filter  with  a  fixed-point  arithmetic  is  presented  in  Fig.  5c. 
In  this  experiment,  we  assigned  12  bits  (including  sign  bit) 
word-length  for  memory  and  16  bits  for  filter  coefficient 
quantization.  The  resulting  NMSE  between  the  prototype  processed 
output  of  Fig.  5b  and  the  SVD/SGK  processed  output  with  fixed-point 
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(c)  SVD/SGK  Convolution 


Figure  5.  Example  of  Image  Convolution  with  Prototype 
Filter  and  SVD/SGK  Filter 


arithmetic  is  0.6742*.  We  show  also  the  magnitude  of  absolute 
difference  between  Fiqs.  5b  and  5c  as  quide  to  perceive  the  errors. 
The  dilference  imaqe  of  Fiq.  5d  was  multiplied  by  scalinq  factor  150. 
To  give  a  practical  idea  of  how  useful  is  the  SVD  expansion  of  H, 
Figs.  6  and  7  show  the  convolution  output  and  difference  image 
corresponding  to  different  deqree  of  approximation.  The  usefulness  of 
SVD  expansion  can  be  demonstrated  by  noting  that  we  can  tradeoff 
between  the  amount  of  approx imat ion  error  and  the  comput at iona 1 
efficiency  by  selecting  the  number  of  terms  in  the  SVD  expansion  of  H. 

Summary 

The  singular  value  decomposition  of  a  2-D  finite  impulse  response 
has  proved  to  be  useful  in  designing  2-D  approx imat i ng  FIR  filters 
that  can  be  implemented  by  means  of  a  pa r a  1 1  el  -  cascade  structure.  We 
have  shown  how  the  algorithms  available  in  the  domain  of  1-D  signal 
processing  for  implementing  cascade  filters  with  fixed  point 
finite-word  length  arithmetic,  can  be  extended  to  2-D  signal 
processing.  It  has  been  seen  that  taking  advantage  of  the  2-D 
structure  by  interlacing  column  and  row  convolutions  sequentially,  can 
be  useful  in  minimizing  the  roundoff  errors.  Finally,  experiments  on 
real  images  have  proved  that  the  SVD/SGK  filters  can  be  useful  in  the 
domain  of  "real-time"  image  convolution. 
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(b)  2  term  SVD/SGK 


(a)  1  term  SVD/SGk 


(c)  3  term  SVD/SGK 


Example  of  Image  Convolution  Using  SVD/SGK  Filter 
with  Different  Degree  of  Approximation  of  H 


(b)  Seal 


Figure  7.  Absolute  Difference  Image  Between  Fiu.  5b  and  Fig 
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2.2  Texture  Energy  Measures 
Kenneth  I .  Laws 


This  paper  presents  a  set  of  "texture  energy"  transforms  that 
provide  texture  measures  for  each  pixel  of  a  monochrome  image.  The 
transforms  are  fast,  requiring  only  one-d imens iona 1  convolutions  and 
simple  moving-average  techniques.  The  method  provides  more  accurate 
classification  than  gray  level  co-occurrence  methods.  It  is  local, 
operating  on  small  image  windows  in  much  the  same  manner  as  the  human 
visual  system.  It  can  be  made  invariant  to  changes  in  luminance, 
contrast,  and  rotation  without  histogram  equalization  or  other 
preprocessing  . 

Figure  1  shows  the  sequence  of  images,  or  image  blocks,  used  in 
measuring  texture.  The  original  image  is  first  filtered  with  a  set  of 
small  convolution  masks  with  fixed,  integer  coefficients.  These  masks 
are  separable,  so  that  only  one-dimensional  convolution  is  required. 
The  filtering  can  also  be  accomplished  with  two-stage  3x3  or  1x3  and 
3x1  convolutions.  It  could  even  be  done  optically. 

The  filtered  images  are  then  processed  with  a  nonlinear  "local 
texture  energy"  filter.  This  is  simply  a  moving-window  average  of  the 


absolute  image  values.  Such  mov i nq-wi ndow  operations  ore  vriy  fast 
even  on  qenet.il  purpose  diqital  computers.  Fnch  pixel  is  examined 
only  twice,  making  the  transform  time  independent  of  window  size.  The 
best  window  size  depends  on  the  size  of  image  texture  regions.  This 
:  tudy  has  concentrated  on  lr>xlr>  windows.  Fven  smallei  windows  could 
b<  used  it  colot  infotmat  ion  were  available.  The  numbei  of  rows  which 
mu;  t  be  ptocessed  at  one  t  ime  is  equal  to  the  window  size. 

Figure  1  shows  a  one-to-one  mapping  between  filtered  imager  and 
1 1  xt  in  i  unrigy  planes.  in  previous  reseat  cli  wc  used  twelve  measures, 
b.  :  c>:  |'i  iin.ii  ily  on  the  tout  stat  ist  ical  moments  suggested  by  Fauqet  as 
and  Ft at t  .  Fxpet  ience  has  shown  that  either  vat iance  or  standard 
deviation  alone  is  sufficient  to  extiact  texture  infotmat  ion  (tom  the 
t  i  l  1 1  i ed  images. 

Vai  iance  is  an  avei  age  squared  deviation  t  i  om  t  hi'  mean.  Foi  a 
zero  mean  field,  as  produced  by  convol ut  ion  with  a  zero-sum  mask, 
vai iance  is  the  avetage  of  squared  signal  values.  It  is  thus  an 
energy  measure,  in  the  formal  sense  of  the  word.  Macro-window 

vai  ranee  measures  the  total  energy  within  a  window.  It  the  image  has- 
been  tiltered,  it  measures  the  local  energy  within  the  pass  band.  The 
SF>V  macio-stat  ist  ie  is  the  square  root  of  this  local  energy.  It  may 
be  considered  a  "texture  energy"  measure.  A  fastet  energy  transform, 
the  ARRAVF  statistic,  will  be  introduced  shoitly. 

These  statist  ics  diftei  ttom  previously  studied  f  t  oqueney-donie  i  n 
text  me  measui  es  because  ot  theii  local  net  me.  Fi  eouency  components 

aie  measured  with  vei y  small  convol ut  ion  masks.  Fach  micto  window  is 

t i cat ed  independently,  without  regard  to  its  phase  relationships  with 
othi't  micto  windows.  This  method,  similet  to  human  visual  processing, 
is  apptopt  rate  tot  textures  with  a  short  coherence  length  or 

cot  i  el.it  ion  distance.  It  is  less  poweitul  than  Foui  iet  methods  lot 
man-made  textures  with  inherent  synch t on i za t i on  ot  texture  clement 
spac i ng s . 
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The  final  step  in  Figure  1  shows  the  lineal  combination  of  the 
t  extuie  energy  planes  into  a  smallci  numbei  of  pt  ineiple  component 
planes,  typically  thteo.  This  is  an  optional  data  compression  step. 
It  is  tempting  to  call  the  final  images  "perceptual  planes,"  but  it 
has  not  yet  been  proven  that  they  i elate  to  human  text  ut  e  percept  ion. 
They  do  stem  to  represent  natural  texture  dimensions,  and  to  be  more 
"reliable"  than  the  texture  energy  planes. 

The  final  output  can  also  be  a  segmented  image.  A  classifiei 
assigning  texture  labels  to  the  image  pixels  can  take  either  text  ui e 
energy  planes  oi  pi  ineiple  component  planes  as  input  .  Classif  ic.it  ion 
is  simple  and  tast  it  texture  classes  are  known  a  pt  ioi  i.  Cl  list  et  oi 
segment  at  ion  algot  ithms  must  be  used  it  texture  classes  ate  unknown. 

Tr Xt  ut  e  Pa t  a 

In  an  expet  imontal  study,  the  results  can  be  no  bet  tot  than  the 
input  data.  We  requite  a  set  ot  uniform  texture  fields  large  enough 
to  provide  adequate  samples  ot  each  textute.  Ideally  this  1 1  a  i  n  i  tig 
set  should  come  t t om  a  tatget  applieat ion  area.  Fot  a  general  vision 
system,  each  textute  must  be  a  "net  lit  al"  one,  and  the  set  must  include 
a  tango  ot  natural  textute  dimensions.  We  avoid  aititicially 
generated  textutes,  such  as  sinusoidal  qtat ings,  because  they  would 
tavot  the  Font  i<  t  ttanstoim  and  otliet  frequency  domain  measures. 

'Ihe  text  utis  we  have  chosen  are  Item  high- 1  esol  ut  ion  photographs 
used  in  the  Brodatz  texture  album.  Figute  2a  shows  a  composite.  The 
fitst  two  tows  ot  1  28x1 2  M  blocks  are  from  the  images  ot  Crass,  Ke t  t i a  , 
Sand,  Wool,  Pigskin,  T.eathet  ,  Water,  and  Wood.  The  lowei-lett 

quadrant  is-  composed  ot  i.’x  t,’  blocks,  and  the  lowet-t  ight  guadtant  ot 
l()xlti  blocks.  The  128x1211  blocks  have  been  individually  histogram 
equal  ized,  the  othet  blocks  have  been  equal  i zed  by  guadtant  . 

Histogram  equalization  has  been  used  to  t  emove  all  fitst  -ordet 
differences.  This  also  finesses  the  problem  ot  whethci  to  measm  e 
luminance  ot  density,  since  t  he  equal  izat  ion  gives-  the  same 

21 


l  mag « 


(b)  First  Component 


(a)  Composite 


(d)  Classification 


(c)  Second  Component 


Figure  2.  Texture  Images 
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result  for  either.  In  our  discriminant  analyses  a  more  rigorous 
adaptive  equalization  has  also  been  used:  it  seems  to  give  more 
conservative  and  reliable  results. 

The  textures  were  chosen  precisely  because  they  are  difficult  to 
discriminate.  They  are  a  worst  case  dataset.  Grass  and  Sand  are  very 
similar,  with  the  main  difference  being  the  extended  edges  in  grass. 
Pigskin,  Raffia,  and  Sand  may  be  considered  cellular  textures  with 
similar  cell  sizes.  Ra f f i a  is  distinguished  by  its  long-range 
structure,  and  Wool  by  its  fiber  content  and  lack  of  coarse  edge 
structure.  The  Grass,  Leather,  Wood,  and  Water  images  all  have 
vertical  structure. 

Comp a r  1  son  Statistics 

Co-occurrence  matrices  are  a  popular  source  of  texture  features. 
We  have  generated  co-occurrence  matrices  from  15x15  source  windows 
requantized  to  32  gray  levels.  Each  matrix  is  thus  32x32.  Nine  of 
these  matrices  are  used,  corresponding  to  horizontal  and  vertical 
spacings  of  zero,  one,  and  seven  pixels.  The  chosen  spacinqs 
correspond  to  horizontal,  vertical,  and  top  left  to  bottom  right 
diagonal  directions.  The  POO  matrix  records  first-order  information: 
all  the  entries  are  on  the  diagonal.  The  other  eight  matrices  record 
second-order  information.  The  matrices  are  not  symmetric,  nor  is 
there  any  averaging  across  different  co-occurrence  angles. 

Table  1  shows  the  classification  accuracies  available  with 
various  feature  sets.  The  first  analysis  uses  only  the  ARM ,  CON,  COR, 
I  DM ,  and  ENT  Haralick  moments  [1],  12].  Together  the  32  features  give 
almost  68%  classification  accuracy  on  the  adaptively  equalized  texture 
set.  The  globally  equalized  textures  generate  two  dominant 
discriminant  functions  using  PIOCON,  P01TDM,  P70IDM,  P11CON,  POICON, 
P10IDM,  PIOCOR,  and  P11COR.  Discriminant  functions  for  adaptively 
equalized  textures  use  PIOCON,  P01IDM,  P70CON,  P11CON,  POICON,  and 
P71C0R.  The  angular  second  moment,  correlation,  and  entropy  features 
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apparently  carry  little  texture  information. 


Table  1.  Co-occurrence  Classification  Accuracy 


Feature  Set 

Global 

Adapt i ve 

Haral ick  Moments 

70.85 

67 . 58 

Rectilinear  Moments 

63.04 

65.92 

Diagonal  Moments 

56.60 

63.04 

Combined  Moments 

72.07 

68.16 

The  second  and  third  analyses  in  Table  1  use  rectilinear  and 
diagonal  moments,  respectively.  These  moments  will  not  be  described 
here.  Neither  set  is  as  powerful  as  the  Haralick  moments.  The  fourth 
analysis  combines  all  of  the  co-occurrence  features,  a  total  of  172 
independent  texture  measures  for  the  nine  co-occurrence  matrices. 
Classification  accuracy  improves  very  little,  and  the  variables 
selected  by  the  discriminant  analysis  are  nearly  all  from  the  Haralick 
set . 

Macro-Statistic  Selection 


Table  2  shows  the  classification  accuracies  possible  with  local 
texture  energy  measures.  The  classifier  used  50  feature  planes  per 
texture,  each  produced  by  convolution  with  a  3-vector,  5-vector,  3x3 
matrix,  or  5x5  matrix.  The  SDV  statistics  were  gathered  over  15x15 
macro  windows.  Classification  accuracies  are  much  higher  than  the  72% 
achieved  with  co-occur rence  statistics. 

Energy  and  variance  are  both  defined  as  sums  of  squares  because 
such  sums  are  analytically  tractable.  The  physical  world  is  under  no 
constraint  to  be  tractable.  It  is  probable  that  the  human  visual 
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system  avoids  root-mean-square  computations,  and  quite  possible  that 
simpler  statistics  are  more  appropriate  for  texture  analysis. 

Table  2.  Macro-Statistic  Classification  Accuracies 


Feature  Set 

Global 

Adapt i ve 

SDV 

85.99 

85.60 

ABSAVE 

88.09 

87.11 

SDV+ABSAVE 

89.16 

87 .55 

Table  2  also  presents  an  alternative  to  the  standard  deviation. 
The  ABSAVE  statistic  is  computed  as  the  average  absolute  value  within 
a  macro  window.  For  a  zero  mean  field  it  may  be  considered  a  fast 
approximation  to  the  standard  deviation.  It  performs  poorly  only  with 
operators  which  are  not  zero-sum.  For  this  dataset,  ABSAVE  features 
are  jointly  more  powerful  than  SDV  features,  and  nearly  as  powerful  as 
both  sets  together. 

The  SDV  and  ABSAVE  macro-statistics  share  a  common  weakness. 
Neither  can  distinguish  between  a  dark  field  with  bright  spots  and  a 
bright  field  with  dark  spots.  In  statistical  terms,  the  two  fields 
differ  in  skewness.  In  frequency  terms,  they  differ  in  phase  rather 
than  energy.  One  solution  is  to  average  positive  values  instead  of 
absolute  values.  It  is  reasonable  that  neurons  in  the  retina  might 
perform  such  a  clipping  function.  Such  one-sided  measures  perform 
slightly  less  well  than  the  SDV  and  ABSAVE  measures,  although  much 
better  than  co-occurrence  statistics.  Another  possibility  is  the 
measurement  of  local  phase  relationships,  as  with  moving-window 
Fourier  transforms.  For  the  present  we  accept  the  ABSAVE  statistic, 
keeping  in  mind  that  there  will  be  some  textures  we  cannot 
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disci  inunjte. 


ABSAVF  features  are  preferred  to  SDV  features  only 
because  of  their  comput at ional  simplicity.  Both  appear  to  be 
equivalent  measures  of  texture  energy. 

Cent e r - We i gh t  ed  Filter  Masks 

Figure  3  shows  three  sets  of  one-d  imens iona 1  convolution  masks. 
The  names  are  mnemonics  for  Level,  Fdge,  Spot,  Wave,  Ripple, 
Undulation,  and  Oscillation.  The  vectors  in  each  set  are  ordered  by 
sequency.  The  vectors  are  weighted  toward  the  center,  all  are 
symmetric  or  ant i-symmet r ic ,  and  all  but  the  Level  vectors  are 
zero-sum.  The  vectors  in  each  set  are  independent,  but  not 
o  r  t  hogona 1 . 

The  1x3  vectors  foim  a  basis  for  the  larger  vector  sets.  Each 
1x5  vector  may  be  generated  by  convolving  two  1x3  vectors.  S5,  for 
instance,  can  be  generated  as  L 3 ( * ) S 3 ,  S3(*)L3,  or  F3(*)F3.  The  1x7 
vectors  can  be  generated  by  convolving  1x3  and  1x5  vectors,  or  by 
twice  convolving  1x3  vectors.  The  sequency  of  a  generated  vector  is 
the  sum  of  the  component  sequencies. 

We  have  investigated  the  discriminant  power  of  one-dimensional 
masks.  Rotat ion- invat  iant  filters,  such  as  the  Sobel  gradient 
magnitude,  are  only  fair  as  texture  measures.  Better  results  are 
obtained  by  using  directional  masks  separately  and  then  combining  the 
texture  energy  measures.  We  have  applied  horizontal  and  vertical 
masks  in  pairs,  although  the  discriminant  analyses  have  not  been 
constrained  to  assign  equal  weights. 

The  six  3-vectors  alone  perform  slightly  better  than  the 
elaborate  co-occurrence  features.  This  is  amazing  considering  the 
simplicity  of  the  texture  energy  method  and  the  many  experimental 
vindications  of  Haralick's  co-occur rence  statistics.  The  5-vector 
statistics  perform  even  better,  achieving  82*  c 1  ass  i  f i ca t ion  accuracy. 
Using  7-vectors  or  combining  more  than  one  vector  size  gives  no 
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Figure  3.  Center-Weighted  Vector  Masks 
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significant  impt ovemcnt . 

Neurological  studies  show  that  the  visual  cortex  computes  edqe 
measures  in  approximately  ten-degree  increments.  We  have  investigated 
diagonal  one-dimensional  features,  although  they  are  not  properly 
members  of  the  separable  feature  sets.  Inclusion  of  diagonal  features 
improves  classification  accuracies  significantly.  The  5-vector 
statistics  alone  are  sufficient  to  achieve  86*  c 1  ass i f i ca t ion 
accuracy,  close  to  the  maximum  reached  in  this  study.  The  combined 
feature  sets  have  little  more  power,  but  provide  insight  into  the 
selection  process.  The  discriminant  functions  are  based  on  vectors  oi 
all  directions  and  sizes.  Different  subsets  are  selected  in  the 
globally  equalized  and  adaptively  equalized  cases.  Yet  all  ot  the 
selected  features  are  either  Edge  statistics  or  the  symmetric  ?pot  , 
Ripple,  and  Oscillation  statistics.  None  of  the  ant i - symmet r i c  Wave 
or  Undulation  features  were  found  useful. 

Figure  4  shows  the  nine  masks  generated  by  convolving  a  vertical 
3-vector  with  a  horizontal  3-vector.  This  may  be  considered  a 
cross-product  or  vector  multiplication  operation,  but  convolution  has 
special  significance  here.  We  shall  extract  texture  information  from 
image  data  by  convolving  with  the  3x3  masks.  Convolution  with  the 
component  one-dimensional  masks  gives  exactly  the  same  result  as 
convolution  with  the  separable  3x3  mask. 

The  nine  independent  3x3  masks  form  a  complete  set.  Any  3x'' 
matrix  can  be  expressed  as  a  unique  linear  combination  of  the  masks. 
The  5x5  masks  and  7x7  masks  also  form  complete  sets,  with  even 
stronger  weighting  towatd  the  center.  The  separable  structure  of 
these  masks  makes  it  feasible  to  apply  them  as  spa t i a  1 -doma i n  filters. 
A  5x5  convolution,  fot  instance,  can  be  implemented  as  two  3x3 
convolutions,  a  5x1  and  a  1x5  convolution,  or  two  3x1  and  two  lx1 
convol ut ions . 

The  two-dimensional  masks  are  even  more  powerful  than  the  tested 
sets  of  one-d imens  iona 1  masks.  Again  the  length  five  masks  are  best, 
although  the  evidence  is  less  conclusive.  Cl assi f icat ion  accuracies 
are  in  the  range  86%  to  88%.  The  adaptively  equalized  3x3+5x5  feature 


subset  differs  from  the  5x5  feature  subset  only  by  the  inclusion  of 
L3S3,  the  ninth  and  lost  feature  to  be  added.  Analyses  with  7x7  masks 
have  shown  no  significant  improvement.  Selected  statistics  again 
differ  from  one  analysis  to  another,  but  Wave  features  are  rare  and 
Undulation  features  are  absent.  The  consistent  inclusion  of  R5R5  is 
somewhat  surprising  since  matching  image  structures  must  be  quite 
rare.  This  mask  resembles  a  two-dimensional  sine  or  Bessel  function. 
The  similar  S5S5  feature  is  individually  very  strong,  but  has  little 
power  when  combined  with  other  features.  Apparently  it  measures  the 
same  texture  energy  as  R5R5. 

Combining  one-dimensional  and  two-dimensional  features  improves 
classification  accuracy  very  little.  Two-dimensional  features  enter 
the  models  first,  followed  by  a  few  of  the  longer  vector  features. 
Again  there  are  few  Wave  and  no  Undulation  features,  despite  the  fact 
they  are  individually  strong  discriminating  features.  Otherwise  the 
selection  seems  somewhat  arbitrary. 

Two  sets  of  filter  masks  have  been  found  which  work  well? 
separable  square  masks  and  rotated  vector  masks.  Very  likely  the 
human  visual  system  uses  rotated  circular  masks.  For  digital  image 
processing  the  square  masks  are  the  most  convenient.  This  dataset 
requires  only  the  Level,  Edge,  Spot,  and  Ripple  5x5  masks. 
Class i f icat ion  of  15x15  blocks  can  be  done  with  accuracies  above  86% 
using  just  the  Level,  Edge,  and  Ripple  or  the  Level,  Spot,  and  Ripple 
subset  s. 

Results 

The  Level  (or  L5L5)  texture  energy  transform  is  sensitive  to 
changes  in  luminance  level.  The  moving-window  average,  or  AVE 
statistic,  can  be  used  as  a  brightness  measure  for  segmentation 
purposes.  Standard  deviation,  or  SDV,  can  be  used  as  a  local  contrast 
measure.  The  other  filters  are  inherently  insensitive  to  low 
frequency  luminance  changes,  and  can  be  made  invariant  to  contrast 
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changes  by  taking  the  ratio  of  ABSAVE  values  to  the  L5L5  SDV  values. 
This  normal i zat ion  redu'es  c 1  ass i f i ca t i on  accuracy  with  the  zero-sum 
filters  by  about  two  percentage  points.  The  cont r ast - i nv ar i ant 
features  may  be  used  to  segment  or  classify  image  textures  without 
prior  histogram  equal  i  zat ion .  Averaging  of  ABSAVE  values  from  rotated 
filter  masks,  such  as  L5E5  and  E5L5,  can  be  used  as  rot  at  ion-  i  nvar  i  ant 
texture  measures.  The  usefulness  of  such  measures  depends  on  the 
application  area.  In  a  general  vision  system  it  is  better  to  use 
directional  measures,  and  to  allow  a  higher  level  processor  to  decide 
which  textures  ate  equivalent. 

We  have  applied  the  con t r ast - i nva r i an t  transforms  to  the 
composite  texture  image.  Figure  2  shows  the  first  two  principle 
component  planes,  scaled  and  histogram  equalized.  The  third  plane 
(not  shown)  is  similar  to  the  second  with  contrast  in  the  first 
quadrant  reversed.  The  discriminant  dimensions  are  the  same  ones 
found  with  co-occurrence  features  and  with  every  other  texture  set  we 
have  tried. 

Figure  2d  shows  the  results  of  classifying  every  pixel  into  one 
of  the  eight  texture  categories.  Coefficients  of  the  classi f icat ion 
functions  were  derived  from  1156  samples  per  texture,  or  approximately 
1  250  of  the  possible  15x15  sample  windows.  Twelve  texture  transforms 
from  the  5x5  Level,  Edge,  Spot  ,  and  Ripple  subset  were  used. 
Processing  time  was  about  25  minutes  on  a  PDP  KL/10. 

It  can  be  seen  that  the  large  blocks  of  Wool,  Water,  and  Wood  are 
almost  perfectly  classified.  Grass  is  the  most  poorly  classified, 
with  only  25*  accuracy.  Most  of  the  misclassi f ied  Grass  pixels  are 
labeled  Leather,  although  other  labels  are  also  common.  Sand  and 
Pigskin  are  also  confused.  The  32x32  blocks  are  very  well  separated, 
although  it  may  be  difficult  to  tell  in  this  gray-scale  picture.  Even 
the  16x16  blocks  are  differentiated  to  an  extent.  Visual  appearance 
could  be  enhanced  by  leaving  "doubtful”  pixels  unc 1  ass i f i ed . 
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2.3  Automatic  Generation  of  Natural  Texture  Descriptions 
F.  Vilnrotter,  R.  Nevatia  and  K.  Price 


I Dll 2^22 2122 

Previously  [1,2]  we  reported  on  some  initial  experiments  with  a 
method  for  generating  descriptions  of  natural  textures.  This  report 
will  review  the  basic  method  and  present  the  actual  description  system 
in  more  detail.  A  historical  discussion  is  presented  in  [1  and  21  and 
is  not  included  in  this  report. 

Our  goal  in  texture  description  is  to  automat ical ly  generate 
descriptions  of  texture  patterns  which  are  similar  to  those  generated 
by  people  and  can  also  be  used  as  a  part  of  the  symbolic  description 
of  objects  in  a  scene.  As  a  first  step  in  this  description  system  we 
have  looked  at  the  problem  of  describing  regular  texture  patterns. 
After  some  initial  exper imentat ion  we  settled  on  studying  how  edge 
elements  repeat  in  the  texture  area.  Edges  are  less  affected  by 
variations  caused  by  noise  and  small  intensity  changes  than  the 
original  image  value. 


The  basic  operation  on  an  edge  image  is  to  compute  an  edge 
repetition  array  (essentially  the  binary  case  of  a  gray  level 
co-occurrence  matt  tx)  .  These  array  values  are  counts  of  the  number  of 
times  an  edge  occurs  at  a  point  and  at  another  point  at  a  given 
distance  and  direction.  We  use  4  directions  with  the  distance  ranging 
from  2  to  32  (more  in  some  cases) .  Several  variations  on  the  basic 
computation  are  required,  involving  the  direction  of  the  edge  (the 
edge  detector  produces  edges  quantized  to  8  directions).  When  looking 
in  the  horizontal  direction  the  only  edges  which  would  correctly 
indicate  the  boundary  of  a  texture  element  would  be  edges  in  a 
vertical  direction.  Correspondingly  in  other  scan  directions  only  the 
perpendicular  edge  directions  are  used.  Other  restrictions  on 
allowable  directions  can  produce  useful  results.  Consider  a  texture 
field  composed  only  of  vertical  stripes.  Each  stripe  would  have  an 
edge  on  each  side,  one  going  up,  one  down.  Tf  the  arrays  are  computed 
using  only  those  pairs  with  opposite  directions  (up  and  down)  then 
there  should  be  large  values  in  the  repetition  array  at  distances 
equal  to  the  size  of  the  stripes  (to  distinguish  intensity  of  the 
stripes,  down  with  up  is  separate  from  up  with  down  -  later  referred 
to  as  the  dark  and  light  totals).  Additionally,  if  exactly  the  same 
directions  are  required  then  the  fE.acj.ng  of  pairs  of  elements  should 
have  large  values  in  the  repetition  arrays.  Finally  the  totals  are 
normalized  based  on  the  total  number  of  edges  in  the  window  in  the 
given  directions 

Example 

A  raffia  subwindow  and  edge  image  along  with  their  corresponding 
element  spacing  and  element  size  graphs  are  given  in  Figure  1.  Figure 
1  c  and  d  each  contain  4  qraphs,  1  for  each  of  the  4  scan  directions: 
horizontal,  45  degrees,  vertical,  and  135  degrees.  The  solid  line 
represents  dark  elem  nt  information  while  the  broken  line  graph 
represents  light  element  information.  The  most  significant  results  in 
our  example  occur  along  the  vertical  scan  direction;  there  is  a  sharp 
peak  at  8  and  its  approximate  multiples  for  both  light  and  dark 
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objects  in  the  vcitical  direction  element  spacing  qraph.  The  vertical 
element  size  graph  shows  peaks  at  2  and  6  repeat inq  at  intervals  of  8. 
One  could  conclude  that  dark  objects  of  element  size  2  and  light 
objects  of  element  size  6  repeat  at  intervals  of  8. 

The  program  (to  be  discussed  in  detail  below)  calculates  edge 
repetition  arrays  from  the  edge  and  direction  images  of  a  texture 
subwindow.  It  then  interprets  these  results  not inq  evidence  of 
periodicity,  element  spacing  and  ot  predominant  element  size,  and 
their  relative  strength.  These  descriptions  are  generated  for  both 
dark  and  light  objects  in  each  of  4  scan  directions. 

The  methodology  of  the  program  is  summarized  below: 

1)  .  The  element  spacing  graph  is  examined  to  determine  if  there  are 
any  element  spacing  peaks  which  repeat,  i.e.,  if  there  are  major  or 
minor  peaks  (as  defined  later)  at  multiples  of  a  given  element  spacing 
v  a  1  ue  . 

2) .  The  element  size  graph  is  examined  to  determine  if  there  is 
support  for  any  potential  element  spacing  value.  Support  is 
established  for  element  spacing  value,  N,  if  there  are  moderate  to 
high  peaks  at  element  size  values:  M,  M*N,  M+2N,...  for  some  element 
size  value  M.  if  such  support  exists  then  it  can  be  said  that  there 
is  evidence  of  periodicity  with  element  spacing  N;  and  there  is 
evidence  that  elements  of  size  M  repeat  with  spacing  N. 

3) .  If  periodicity  cannot  be  established  then  the  element  size  graph 
is  searched  for  predominant  element  sizes. 

The  remainder  of  this  report  will  be  a  descriptive  outline  of  the 
program's  2nd  part.  This  program  is  not  in  its  final  foim,  and  all 
results  are  preliminary. 


The  major  program  sections  are  as  follows: 


1) .  Gene£at^_on  of  Ed<je  B2E2£.i.!:A22  “  The  description  of  this 
section  was  provided  in  a  previous  report  [1], 

2) .  Construct  ion  of  PEAKZVALLEY  SIGNATURE  -  A  peak-valley  signature 
is  constructed  for  each  of  the  graphs.  It  has  l's  in  places  where 
peaks  occur,  -l's  in  places  where  Valleys  occur  and  0's  in  places 
which  are  hillsides  -  for  example,  the  vertical  direction,  solid  line 
graph  in  Fig.  lc  has  signature  -1,  0,  0,  0,  0,  0,  1,  0,  0,  0,  -] ,  0, 
0,  0,  1  for  values  2  thru  16.  Later  in  the  program,  peaks  are  further 
classified  as  major,  minor  or  subminor  end  the  l's  in  the  signature 
are  adjusted  to  reflect  this  class i f icat ion  by  being  set  to  3,  2,  or  1 
respectively.  This  signature  is  used  to  provide  easy  access  to 
location,  width  and  classification  of  peaks. 

3)  .  Determine  solitude  descr  iptor  s  and  constant  £12111®  ~  The 
amplitude  of  each  point  on  the  graph  is  classified  as  Very  Strong, 
Strong,  Moderate,  Weak  or  Very  Weak.  These  classifications  are 
assigned  according  to  amplitude  ranges,  e.g.  a  point  on  the  graph 
which  has  amplitude  greater  than  15  would  be  assigned  VERY  STRONG  as 
its  amplitude  descriptor,  a  point  whose  amplitude  lies  between  10  and 
15  would  be  assigned  STRONG,  etc.  These  descriptors  are  stored  in  the 
amplitude  descriptor  matrix.  The  constant  profile  of  a  graph  is  a 
descriptor  which  categorizes  a  graph  in  terms  of  how  closely  it 
approximates  a  constant.  Both  of  the  above  are  used  in  the  final 
(description  generation)  section. 

The  constant  profile  of  the  vertical  broken  line  graph  in  figure 
lc  is  0  since  this  graph  does  not  approximate  a  constant.  However, 
the  135  degrees  broken  line  graph  has  a  constant  profile  of  4  since  it 
approximates  a  constant  fairly  well. 

4) .  Collect  Peak  Informat  ion  -  The  following  information  is 
catalogued  for  each  peak  for  use  later  in  the  program: 


a)  . 

location  of  left  valley, 

b)  . 

location  of  right  valley. 

c)  . 

difference  in  amplitude  between 

peak 

a  nd 

left  val ley, 

d)  . 

difference  in  amplitude  between 

peak 

and 

right  valley, 

e)  . 

peak  width, 

f)  . 

valley  to  valley  width, 

g)  • 

ampl  i  t  ude  , 

list  of 

peaks  is  also  formed  in  this 

sect  ion 

to  fac i 1 i tate 

sequential  examination  of  peaks.  Peaks  with  maximum  peak-valley 
variation  (the  maximum  of  c  and  d  above)  greater  than  a  predetermined 
floor  value  are  labelled  minor  peaks  (later  some  of  these  peaks  may  be 
determined  to  be  major  peaks).  All  others  are  subminor  peaks. 


5).  Extract  Major  Peaks  -  A  list  of  major  peaks  is  formed  in  this 
section.  A  peak  is  considered  to  be  major  if: 


a)  .  its  amplitude  is  above  a  predefined  floor  value, 

b)  .  MIN  left  peak-valley  amplitude  difference,  right  peak-valley 

amplitude  difference  is  above  a  predefined  floor  value,  and 

c)  .  its  amplitude  is  above  a  predetermined  percentage  of  the 

maximum  graph  value. 

In  Fig.  lc  there  is  only  one  broken  line  graph  having  major  peaks.  It 
is  the  vertical  direction  broken  line  graph  and  its  major  peaks  occur 
at  8,  17,  25  and  26. 


6).  Determine  repet i t  ion  descpjptops  -  In  the  first  part  of  this 
section  the  element  spacing  graph  is  used  while  the  second  part  uses 
element  size  information. 


a)  .  Use  of  element  spaejnej  2£pph  -  Peaks  are  classified  as 
major,  minor  and  subminor.  Subminor  peaks  are  too  weak  to  be  used  in 
this  section. 
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The  lirst  major  peak  is  considered  as  a  possible  element  spacinq 
value.  Checks  ate  made  to  determine  if  peaks  occur  at  mult iples  of 
this  value.  Then  the  appt  opt iate  desctiptoi  is  assigned  as  follows: 

3  -  This  desct iptot  is  assigned  it  each  multiple  ot  the  value 
being  considered  is  the  locat  ion  of  a  major  peak.  (At  least  one 
repetition  must  be  found.).  •' 

2  -  This  descriptor  is  assigned  it  at  least  one  multiple  of  the 
value  being  considered  is  the  locat ion  01  a  majot  peak  and  the  rest 
are  at  least  minot  peaks.  (At  least  1  lopet  it  ion  must  be  found)  . 


Same 

as 

t  O  l 

n 

4. 

with  t  he  except  i on 

that  one  of  the 

t  epe  t  it  i  ve 

m  i  s 

s  t  ng 

. 

Tn 

this  case  at  least 

1  t  epe  t  i  t  i ons  must 

bo  found. 

non  e 

o  t 

t  hr 

above  cases  apply. 

If  the  desctiptoi  is  greater  than  zeto  all  repetitions  ot  the 
peak  presently  being  considered  are  tagged.  The  next  peak  to  be 
consideted  is  the  next  untngged  m.  ioi  peak. 

The  process  continues  until  all  untngged  majot  peaks  ate 
cat t  got  i zed . 

b) .  Use  ot  element  size  graph  -  The  element  size  gtaph  is  used 
to  lend  support  to  the  evidence  found  in  a. 

Element  spacing  values  arc  considered  in  otdet  ot  decreasing 
positive  repetition  desct iptot  value,  say  01,  determined  in  la.  It  it 
is  observed  that  peaks  occut  at  locations  M,  M+N,  M+2N,...  fot 
clement  size  M  and  element  spacing  value  N  then  the  size  repet  it  ion 
desct iptot  ,  say  02,  fot  N  is  set  to  1,  2,  1  ot  0  as  explained  in  a. 


NOTE:  When  repetitions  ate  being  sought  a  matgin  of  l.S  to  the  left 
and  right  is  given. 


If  the  descriptor  produced  while  considering  element  size  M  is 
greater  than  zero  repetitions  of  the  peak  namely  M+N,  M+2N,...  are 
both  temporarily  and  permanently  tagged.  For  a  given  N,  D2  is 
maximized  over  element  size  peaks  which  are  not  temporarily  tagged. 
After  all  element  size  peaks  which  are  greater  than  subminor  peaks  and 
which  are  not  temporarily  tagged  are  considered  the  temporary  tags  are 
reset  and  the  next  element  spacing  value  is  considered. 

If  the  descriptor  produced  while  considering  element  size  M  is 
greater  than  zero  then  element  size,  M,  is  assigned  an  ordered  pair, 
say  <S1,S2>,  where  SI  is  the  spacing  value  N  at  which  the  size  value  M 
repeats  and  S2  is  the  repetition  descriptor  currently  produced.  SI 
and  S2  are  reset  only  if  the  current  descriptor  produced  is  greater 
than  the  current  value  of  S2,  i.e.,  S2  is  maximized  over  element 
spacing  values  with  positive  Dl. 

The  repetition  descriptors  Dl  and  D2  associated  with  untagged 
element  spacing  peaks  as  well  as  the  ordered  pairs  <S1,S2>  associated 
with  non-permanently  tagged  element  size  peaks  are  considered  when 
descriptions  are  being  generated  in  the  interpretation  section 
described  below. 

For  the  set  of  graphs  in  Figs,  lc  and  d  the  vertical  element 
spacing  value  8  has  Dl=D2=3  for  both  light  and  dark  objects.  The 
vertical  element  size  values  2  (for  dark  objects)  and  6  (for  bright 
objects)  have  <S1 ,  S2>  =  ^8 , 3> 

7).  Interpretat ion  Sect  ion  -  A  simplified  outline  of  this  section  is 
given  in  Fig.  2.  If  all  of  the  case  statements  and  nested 
conditionals  were  expanded  this  section  would  essentially  consist  of  a 
set  of  production  rules  of  the  form: 

IF (CONDI  AND  C0ND2  AND... AND  CONDN ) 

THEN  PRINT ( "APPROPRIATE  DESCRIPTIVE  MESSAGE"). 
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A  small  subset  of  these  expanded  production  rules  follows: 


Ru_le  I:  IF(Element  size  graph  closely  approximates  a  constant) 

THEN  PRINT("No  evidence  of  periodicity  or  predominant 
element  size" ) . 

RuJ.e  2:  IF(Element  spacing  graph  closely  approximates  a  constant) 

AND  (Element  size  graph  has  constant  profile  0) 

AND  (There  exists  a  major  peak  at  element  size  value  N  with 
amplitude  descriptor  STRONG) 

THEN  PRINTC'No  evidence  of  periodicity.  STRONG  evidence  of 
element  size  of  N" )  . 

RuJ_e  3:  IF(Element  spacing  graph  has  constant  profile  0) 

AND  (Element  size  graph  has  constant  profile  0) 

AND  (There  is  a  major  peak  at  element  spacing  value,  N, 
having  amplitude  descriptor  STRONG) 

AND  (Element  Spacing  repetition  descriptor,  D1 ,  for  N  is  3) 
AND  (Element  size  repetition  descriptor,  D2,  for  N  is  3) 

THEN  PRINT  ("There  is  VERY  STRONG  evidence  of  periodicity 
with  element  spacing  N")  . 

Rule  4:  IF  (Element  spacing  graph  has  constant  profile  0) 

AND  (Element  size  graph  has  constant  profile  0) 

AND  (There  is  a  major  peak  at  element  size  value  N) 

AND  (Amplitude  descriptor  for  N  is  at  least  STRONG) 

AND  (N  is  not  permanently  tagged) 

AND  ( S 2  associated  with  N  is  3) 

THEN  PRINT  ("There  is  STRONG  evidence  of  element  size  N  with 
STRONG  support  for  element  spacing  SI"). 

The  constant  profile  of  a  graph  is  defined  in  the  description  of 
program  section  3.  Repetition  descriptors  Dl ,  D2,  (SI,  S2)  are 


defined  in  the  description  of  program  section  6.  Tagging  is  also 
discussed  in  that  section.  Since  the  program  is  not  in  final  form  use 
of  the  rules  listed  above  is  tentative. 

The  description  generated  for  the  set  of  graphs  given  in  figure  1 
is  g iven  in  figure  3 . 

Results 

The  mosaic  pictured  in  figure  4a  contains  2  sample  images  of 
raffia  in  the  top  row  and  2  sample  images  of  herringbone  material  in 
the  2nd  row.  Figure  4b  contains  the  corresponding  edge  images  for 
these  samples.  The  highly  regular  patterns  of  raffia  and  herringbone 
material  produces  strong  periodic  patterns  in  the  plots  of  the  edge 
repetition  arrays  given  in  Figures  5-8  a  and  b. 

The  most  significant  results  for  raffia  are  seen  in  the  vertical 
scan  direction.  Major  peaks  repeat  at  multiples  of  8  or  9  in  the 
vertical  element  spacing  graphs  (Figs.  5a,  6a).  Likewise  peaks  at  2 
(for  dark  objects)  and  6  (for  light  objects)  repeat  at  increments  of 
8  in  the  vertical  element  size  graphs  (Figs.  5b,  6b).  This  would 
tend  to  indicate  that  dark  elements  of  size  2  repeat  with  a  spacing  of 
8;  and  that  light  elements  of  size  6  repeat  with  a  spacing  of  8. 
The  automatic  descriptions  for  raffia  given  in  Figures  5c,  6c  report 
similar  evidence.  This  is  consistent  with  the  results  for  the  sample 
of  raffia  discussed  earlier  (see  Figures  1,  3).  Less  significant  is 
the  periodic  result  found  in  the  horizontal  scan  direction  for  dark 
objects.  Dark  objects  with  widths  of  roughly  between  7  and  Q  repeat 
with  element  spacing  11  (Fig.  5a, b  and  6a, b) .  The  horizontal  dark 
element  graphs  lend  strong  support  to  regularity.  However,  the 

amplitude  of  the  peaks  in  the  clement  size  graphs  is  so  low  that 
element  size  evidence  is  considered  weak  (Figs.  5,6b  and  c) .  The  only 
other  evidence  worth  noting  in  the  horizontal  scan  direction  is  the 
evidence  of  element  size  2  for  light  objects.  No  evidence  of 
periodicity  is  found  for  light  objects. 
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FILENAME  =  RAFFIA . TST 

DARK  OBJECT  DESCRIPTIONS 
HORIZONTAL  SCAN  DIRECTION 


THERE 

IS 

STRONG 

EVIDENCE 

OF 

PERIODICITY 

WITH 

ELEMENT 

SPACING 

11.00000 

THERE 

IS 

STRONG 

EVIDENCE 

OF 

PERIODICITY 

WITH 

ELEMENT 

SPACING 

12.00000 

THERE  IS  WEAK  EVIDENCE  OF  ELEMENT  SIZE  10.00000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  11.00000 

45  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 
VERTICAL  SCAN  DIRECTION 

THERE  IS  VERY  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  8.000000 

THERE  IS  STRONG  EVIDENCE  OF  ELEMENT  SIZE  2.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  8.000000 

135  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 

LIGHT  OBJECT  DESCRIPTIONS 
HORIZONTAL  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY 

MODERATE  EVIDENCE  OF  ELEMENT  SIZE  OF  2.000000 
45  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 
VERTICAL  SCAN  DIRECTION 

THERE  IS  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  8.000000 

THERE  IS  STRONG  EVIDENCE  OF  ELEMENT  SIZE  6.000000 
WITH  STRONG  SUPPORT  FUR  ELEMENT  SPACING  8.000000 

135  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 

Fig.  3.  Automatic  Description  of  Raffia  in  Fig.  1 


4  3 


a)  4  Subwindows  for  Texture  Analysis,  Raffia  on  Top, 
Herringbone  Material  on  Bottom 


b)  Non-maximal  Suppressed  Edges  from  a 


Fig.  4.  MOSAIC  1 
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FILENAME  =  RAFFIA. TST22 

DARK  OBJECT  DESCRIPTIONS 
HORIZONTAL  SCAN  DIRECTION 

THERE  IS  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  11.00000 

THERE  IS  WEAK  EVIDENCE  OF  ELEMENT  SIZE  7.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  11.00000 

THERE  IS  WEAK  EVIDENCE  OF  ELEMENT  SIZE  8.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  11.00000 

THERE  IS  WEAK  EVIDENCE  OF  ELEMENT  SIZE  9.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  11.00000 

45  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 
VERTICAL  SCAN  DIRECTION 

THERE  IS  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  8.000000 

THERE  IS  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  9.000000 

THERE  IS  STRONG  EVIDENCE  OF  ELEMENT  SIZE  2.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  8.000000 

135  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 
LIGHT  OBJECT  DESCRIPTIONS 
HORIZONTAL  SCAN  DIRECTION 

THERE  IS  MODERATE  EVIDENCE  OF  ELEMENT  SIZE  2.000000 
45  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 
VERTICAL  SCAN  DIRECTION 

THERE  IS  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  9.000000 

THERE  IS  STRONG  EVIDENCE  OF  ELEMENT  SIZE  6.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  9.000000 

135  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 

c) .  Automatic  Description  of  RAFFIA. TST22 

Fig.  5.  CONTINUED 
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FILENAME  RAFFIA. TST24 

DARK  OBJECT  DESCRIPTIONS 
HORIZONTAL  SCAN  DIRECTION 

THERE  IS  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  11.00000 

THERE  IS  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  12.00000 

THERE  IS  WEAK  EVIDENCE  OF  ELEMENT  SIZE  7.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  11.000000 

THERE  IS  WEAK  EVIDENCE  OF  ELEMENT  SIZE  9.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  11.000000 

45  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 
VERTICAL  SCAN  DIRECTION 


THERE 

IS 

VERY 

STRONG 

EVIDENCE 

OF 

PERIODICITY 

WITH 

ELEMENT 

SPACING 

8 . 000000 

THERE 

IS 

VERY 

STRONG 

EVIDENCE 

OF 

PERIODICITY 

WITH 

ELEMENT 

SPACING 

9.000000 

THERE  IS  STRONG  EVIDENCE  OF  ELEMENT  SIZE  2.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  8.000000 

115  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 

L I  GUT  OBJECT  DESCR I PT IONS 
HORIZONTAL  SCAN  DIRECTION 

THERE  IS  MODERATE  EVIDENCE  OF  ELEMENT  SIZE  2.000000 
45  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 
VERTICAL  SCAN  DIRECTION 

THERE  IS  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  8.000000 

THERE  IS  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  9.000000 

THERE  IS  STRONG  EVIDENCE  OE  ELEMENT  SIZE  0. 000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  8.000000 

1  15  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 

c)  Automatic  Description  of  RAFFIA. TST24 
Fig.  CONTINUED 
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The  most  significant  results  for  herringbone  material  are  seen  in 
the  horizontal  directions.  Major  element  spacing  peaks  repeat  at 
multiples  of  11  and  13  for  both  light  and  dark  objects  in  the 
horizontal  and  vertical  scan  directions  respectively  (see  Figs.  7a, 
8a).  The  element  size  graphs  for  herringbone  material  show  that  there 
are  2  predominant  element  size  peaks  which  repeat  at  intervals  of  11 
in  the  horizontal  scan  direction.  These  peaks  are  found  roughly  at 
element  size  values  4  and  8  for  dark  objects  and  3  and  9  for  light 
objects  (see  Figs.  7b,  8b).  Likewise,  there  are  2  predominant  element 
size  peaks  which  repeat  at  intervals  of  13  in  the  vertical  scan 
direction.  These  peaks  are  found  roughly  at  element  size  values  3  and 
10  for  dark  objects  and  3  and  9  for  light  objects  (see  Figs.  7b,  8b). 
The  automatic  descriptions  of  herringbone  material  given  in  Figures 
7c,  8c  indicate  that  there  is  evidence  that  the  herringbone  material 
samples  are  regular  with  spacing  11  in  the  horizontal  scan  direction 
and  13  in  the  vertical  scan  direction;  and  that  the  predominant 
element  sizes  for  dark  and  light  objects  in  these  scan  directions  are 
found  at  approximately  element  size  values  3  and  10. 


The  mosaic  pictured  in  figure  9a  contains  2  sample  images  of  wood 
in  the  top  row  and  1  sample  each  of  water  and  San  Francisco  in  the 
bottom  row.  Figure  9b  contains  the  corresponding  edge  images  for 
these  samples. 


The  most  significant  results  for  wood  and  water  occur  in  the 
horizontal  scan  direction.  The  most  salient  characteristic  of  the 
horizontal  graphs  of  these  samples  in  the  major  peak  occurring  at 
element  size  value  2.  (see  Figs.  10b,  lib,  12b).  Each  of  the 
automatic  descriptions  (Figs.  10c,  11c,  12c)  note  strong  evidence  for 
element  size  2  for  for  dark  and  light  object  in  the  horizontal  scan 
direction.  An  element  spacing  value  of  4  is  strongly  indicated  for 
both  dark  and  light  objects  in  the  horizontal  scan  direction  of  the 
water  subwindow;  while  only  1  of  the  wood  subwindow  descriptions 
indicates  evidence  of  periodicity  and  this  only  for  light  object  in 
the  horizontal  scan  direction. 


-  — '  - 
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FILENAME 


HBONE.TSTl 


DARK  OBJECT  DESCRIPTIONS 
HORIZONTAL  SCAN  DIRECTION 

THERE  IS  VERY  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  11.00000 

THERE  IS  MODERATE  EVIDENCE  OF  ELEMENT  SIZE  4.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  11.00000 

THERE  IS  MODERATE  EVIDENCE  OF  ELEMENT  SIZE  8.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  11.00000 

45  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 
VERTICAL  SCAN  DIRECTION 

THERE  IS  VERY  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  13.00000 

THERE  IS  MODERATE  EVIDENCE  OF  ELEMENT  SIZE  3.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  13.00000 

THERE  IS  MODERATE  EVIDENCE  OF  ELEMENT  SIZE  4.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  13.00000 

THERE  IS  WEAK  EVIDENCE  OF  ELEMENT  SIZE  5.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  13.00000 

THERE  IS  MODERATE  EVIDENCE  OF  ELEMENT  SIZE  11.00000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  13.00000 

135  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 

LIGHT  OBJECT  DESCRIPTIONS 
HORIZONTAL  SCAN  DIRECTION 

THERE  IS  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  11.00000 

THERE  IS  STRONG  EVIDENCE  OF  ELEMENT  SIZE  3.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  11.00000 

THERE  IS  WEAK  EVIDENCE  OF  ELEMENT  SIZE  9.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  11.00000 

45  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 
VERTICAL  SCAN  DIRECTION 

THERE  IS  VERY  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  13.00000 

THERE  IS  MODERATE  EVIDENCE  OF  ELEMENT  SIZE  3.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  13.00000 

THERE  IS  MODERATE  EVIDENCE  OF  ELEMENT  SIZE  9.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  13.00000 

135  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY 

WEAK  EVIDENCE  OF  ELEMENT  SIZE  OF  4.242041 

WEAK  EVIDENCE  OF  ELEMENT  SIZE  OF  12.72792 

c)  Automatic  Description  of  HBONE.TSTl 

Fiq.  7.  CONTINUED 
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HB0NE.TST2  Edge  Repetition  Results  and 
Automatic  Descrint ion 


FILENAME  =  HBONE . TST2 
DARK  OBJECT  DESCRIPTIONS 
HORIZONTAL  SCAN  DIRECTION 

THERE  IS  VERY  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  11.00000 

THERE  IS  MODERATE  EVIDENCE  OF  ELEMENT  SIZE  4.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  11.00000 

THERE  IS  MODERATE  EVIDENCE  OF  ELEMENT  SIZE  8.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  11.00000 

THERE  IS  MODERATE  EVIDENCE  OF  ELEMENT  SIZE  9.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  11.00000 

45  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 
VERTICAL  SCAN  DIRECTION 

THERE  IS  VERY  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  13.00000 

THERE  IS  MODERATE  EVIDENCE  OF  ELEMENT  SIZE  3.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  13.00000 

THERE  IS  MODERATE  EVIDENCE  OF  ELEMENT  SIZE  4.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  13.00000 

THERE  IS  MODERATE  EVIDENCE  OF  ELEMENT  SIZE  10.00000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  13.00000 

135  DEGREE  SCAN  DIRECTION 

THERE  IS  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  8.485282 

THERE  IS  WEAK  EVIDENCE  OF  ELEMENT  SIZE  4.242641 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  8.485282 

LIGHT  OBJECT  DESCRIPTIONS 

HORIZONTAL  SCAN  DIRECTION 

THERE  IS  VERY  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  11.00000 

THERE  IS  VERY  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  12.00000 

THERE  IS  STRONG  EVIDENCE  OF  ELEMENT  SIZE  3.000000 
WITH  STRONG  SI  .'PORT  FOR  ELEMENT  SPACING  11.00000 

THERE  IS  MODERATE  EVIDENCE  OF  ELEMENT  SIZE  8.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  11.00000 

45  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY 

WEAK  EVIDENCE  OF  ELEMENT  SIZE  OF  2.828427 

VERTICAL  SCAN  DIRECTION 

THERE  IS  VERY  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  13.00000 

THERE  IS  MODERATE  EVIDENCE  OF  ELEMENT  SIZE  3.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  13.00000 

THERE  IS  MODERATE  EVIDENCE  OF  ELEMENT  SIZE  ''•.  000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  13.00000 

135  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY 

WEAK  EVIDENCE  OF  ELEMENT  SIZE  OF  4.242641 

WEAK  EVIDENCE  OF  ELEMENT  SIZE  OF  12.72792 

c)  Automatic  Description  of  HBONE. TST2 
Fie.  8.  CONTINUED 
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a)  WOOD . TSTl  Element  Spacing  Graphs 


b)  WOOD. TSTl  Element  Size  Graphs 


Fig.  10.  WOOD. TSTl  Edge  Repetition  Results  and 
Automatic  Description 
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FILENAME  =  WOOD.TSTl 


DARK  OBJECT  DESCRIPTIONS 
HORIZONTAL  SCAN  DIRECTION 

THERE  IS  STRONG  EVIDENCE  OF  ELEMENT  SIZE  2.000000 
45  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 

VERTICAL  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY 

WEAK  EVIDENCE  OF  ELEMENT  SIZE  OF  3.000000 

135  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 

LIGHT  OBJECT  DESCRIPTIONS 
HORIZONTAL  SCAN  DIRECTION 

THERE  IS  STRONG  EVIDENCE  OF  ELEMENT  SIZE  2.000000 
THERE  IS  STRONG  EVIDENCE  OF  ELEMENT  SIZE  20.00000 
45  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 
VERTICAL  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY 

WEAK  EVIDENCE  OF  ELEMENT  SIZE  OF  2.000000 
135  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 


c)  Automatic  Description  of  WOOD.TSTl 


Fig.  10.  CONTINUED 
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F 1 1.  ENA  ME 


WOOD. TST3 


DARK  OBJECT  DESCRIPTIONS 
HORIZONTAL  SCAN  DIRECTION 

THERE  IS  STRONG  EVIDENCE  OF  ELEMENT  SIZE  2.000000 
45  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 

VERTICAL  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY 

WEAK  EVIDENCE  OF  ELEMENT  SIZE  OF  2.000000 

135  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 

LIGHT  OBJECT  DESCRIPTIONS 
HORIZONTAL  SCAN  DIRECTION 

THERE  IS  VERY  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  7.000000 

THERE  IS  STRONG  EVIDENCE  OF  ELEMENT  SIZE  2.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  7.000000 

45  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 

VERTICAL  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY 

WEAK  EVIDENCE  OF  ELEMENT  SIZE  OF  2.000000 

135  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 


c)  Automatic  Description  of  WOOD.TST3 


Fiu.  11.  CONTINUED 
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OEPT  253*'  2&4'~ 


b)  H20.TST1  Element  Size  Graphs 


Fig.  12.  H20.TST1  Edge  Repetition  Results  and 
Automatic  Description 


FILENAME  =  H20.TST1 


DARK  OBJECT  DESCRIPTIONS 
HORIZONTAL  SCAN  DIRECTION 

THERE  IS  VERY  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  4.000000 

THERE  IS  STRONG  EVIDENCE  OF  ELEMENT  SIZE  2.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  4.000000 

45  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 
VERTICAL  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 
135  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 

LIGHT  OBJECT  DESCRIPTIONS 
HORIZONTAL  SCAN  DIRECTION 

THERE  IS  VERY  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SAPCING  4.000000 

THERE  IS  STRONG  EVIDENCE  OF  ELEMENT  SIZE  2.000000 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  4.000000 

45  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 
VERTICAL  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 
135  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 


c)  Automatic  Description  of  H20.TST1 
Fig.  12.  CONTINUED 
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As  one  might  expect  from  looking  at  the  San  Francisco  subwindow 
(Fig.  ‘la)  the  most  significant  gtaphic  results  occur  in  the  45  degree 
scan  direct  ion.  Element  spacing  peaks  repeat  at  multiples  of  ~  1 5 .  5 
tot  light  and  dark  objects  in  this  scan  direction  (Fig.  11a) .  Support 
tor  this  element  spacing  value  is  given  by  the  element  size  graphs  for 
the  45  degree  scan  direct  ion  given  in  Fig.  13b)  .  Moderate  peaks  at  ~4 
for  dark  objects  and  ~10  tor  light  objects  repeat  at  intervals  of 
roughly  15.5. 

The  automatic  description  generated  for  the  San  Francisco 
subwindow  (Fig.  13c)  notes  evidence  of  periodicity  with  element 
spacing  value  ~15.5.  It  also  notes  that  dark  objects  of  size  ~4  and 
light  objects  of  ~9.8  repeat  with  element  spacing  15.5. 

Cone  1  us  ions 

The  results  presented  here  ate  preliminary,  but  hopefully 
indicate  a  step  towards  useful  structural  deset i pt ions.  Tn  future 
work,  we  hope  to  improve  the  description  process  and  evaluate  the 
result  i ng  deset i pt ions  using  more  test  samples. 
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a)  SFOR.TST  Element  Spacing  Graphs 


b)  SFOR.TST  Element  Size  Graphs 


g.  13.  SFOR.TST  (San  Francisco)  Edqe  Repetition  Results 
and  Automatic  Description 


FILENAME 


SFOR. TST 


DARK  OBJECT  DESCRIPTIONS 

HORIZONTAL  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY 

WEAK  EVIDENCE  OF  ELEMENT  SIZE  OF  4.000000 

45  DEGREE  SCAN  DIRECTION 

THERE  IS  MODERATE  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  15.55635 

THERE  IS  MODERATE  EVIDENCE  OF  ELEMENT  SIZE  4.242641 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  15.55635 

VERTICAL  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 
135  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 


LIGHT  OBJECT  DESCRI PTIONS 
HORIZONTAL  SCAN  DIRECTION 


THERE  IS  WEAK  EVIDENCE  OF  ELEMENT  SIZE  4.000000 
THERE  IS  WEAL  EVIDENCE  OF  ELEMENT  SIZE  12.00000 
45  DEGREE  SCAN  DIRECTION 

THERE  IS  STRONG  EVIDENCE  OF  PERIODICITY  WITH  ELEMENT  SPACING  15.55635 

THERE  IS  WEAK  EVIDENCE  OF  ELEMENT  SIZE  4.242641 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  15.55635 


THERE  IS  WEAK  EVIDENCE  OF  ELEMENT  SIZE  9.899495 
WITH  STRONG  SUPPORT  FOR  ELEMENT  SPACING  15.55635 

VERTICAL  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY  OR  PREDOMINANT  ELEMENT  SIZE 
135  DEGREE  SCAN  DIRECTION 

NO  EVIDENCE  OF  PERIODICITY 

MODERATE  EVIDENCE  OF  ELEMENT  SIZE  OF  4.242641 


c)  Automatic  Description  of  SFOR. TST 


Fig.  13 


CONTINUED 


2.4  Probability  Density  Function  of  the  Sinqular  Values  of  a  Random 
Texture  Field 


Behnam  Ash j ar  i 


Singular  values  of  a  sample  matrix  or  picture  can  be  considered 
as  descriptor  or  features  of  the  matrix  or  picture  elements  and  their 
inter-relationships.  Although  singular  value  decomposition,  in  the 
recent  years,  has  gained  applications  and  importance,  there  has  not 
been  any  statistical  treatment  of  the  subject.  What  follows  is  the 
results  of  a  systematic  treatment  of  this  subject.  For  details  and 
derivations,  the  reader  can  refer  to  [11. 

§iD22li?I  Vajlue  Decompos  i  t  ion 

A  general  kxn  matrix  F  can  be  decomposed  into  three  matrices  as 
fol lows : 


T 

F  U  S  V 

where,  U  is  a  kxk  orthogonal  matrix  such  that 

T 

U  U  lk 


(1 ) 


(2) 


and  V  is  a  nxk  orthogonal  matrix  such  that 

V  VT  =  In  (3) 

and,  S  is  a  diagonal  matrix  whose  elements  are  real,  and  positive  and 
are  called  singular  values  of  F.  If  the  elements  of  S  are  arranqed 
according  to  a  descending  order,  then  the  decompos i t i on  (1)  will  he 
unique.  It  can  be  shown  that 


(4) 


T  T 

F  F 1  =  U  A  u 


and  , 


FT  F  =  V  A  VT 


where, 


A  = 


S 


2 


(5) 


(6) 


ZonaJ.  Polynomials  of  the  Group  Rf P£osent a t j_on  Theory 

The  concept  of  zonal  harmonics  of  a  positive  definite  symmetric 

matrix  has  been  introduced  in  [2].  A  detailed  study  of  this  concept 

has  been  presented  in  f  3 ] -  Consider  a  KxK  positive  definite, 

symmetric  matrix  A.  Let  V  be  the  vector  space  of  homogeneous 

P 

polynomials  of  degree  P  in  the  distinct  elements  of  A.  Tt  can  be 
proved  that  Vp  decomposes  into  a  direct  sum  of  irreducible  invariant 
subspaces  is  a  partition  of  P  into  not  more  than  k  parts  such 

that  p  p+p2+ .  .  . +pk=p  and  P^>P2  >  •  •  For  every  value  of  p,  there  are 

a  number  of  possible  partitions  shown  as  {f-}.  The  polynomial 

P 

(trA)  eVp  has  a  unique  decompos i t ion  into  polynomials 
represented  as  in  (7): 


ytWfi 


( trA)  P  =  X)  C.  (A)  <7) 

ft  P 

where,  C^(A)  is  the  zonal  polynomial  of  matrix  A  with  respect  to  the 
partition  p  of  P  and  E  means  summation  over  {/*},  Zonal  polynomials 
can  be  obtained  analytically  and  in  the  following  derivations,  they 
can  be  evaluated  computationally  up  to  a  desired  degree  of  accuracy. 
For  a  more  detailed  introduction  to  the  zonal  representation  of  a 
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symmetric  space,  the  reader  is  referred  to  [1], 

fyD£ii2D  2i  z 

Let  F  be  a  kxn  (k<n)  matrix  whose  elements  are  normally 
distributed  with  correlation  in  column  and  row  directions.  Let  Kc  and 
Kr  be  covariance  matrices  for  column  and  row  directions.  And,  let 

g(F)  =  p.d.f.  of  F  ( 8 ) 


and  , 


dG(F)  ^  distribution  of  F  *  g(F)d(F) 
Then,  p.d.f.  of  F  is  obtained  as 


(9) 


(10) 


(2n) 


|K 


'k-r 


Different  ial  Decomposition  of  F 

Using  differential  operations  on  decomposition  of  a  matrix  (41, 
and  using  equation  (1),  we  can  obtain 


dF  =  -2— 1  ■  - - —  (  n  s.)n  k  IT  (s?-s?)d(S)d(U)d(V)  (11) 

rk(|)rk(|)  i=i  Kj  J 


where,  ( x )  is  K-variate  gamma  function  of  x  and. 
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(12) 


S  =  diag (s , , s_ , . . . , s^) 


also, 


k 

d(S)  =  n  d(s.)  (13) 

i-1  1 

the  differential  form  d(U)  is  the  normalized  invariant  or  Haar  measure 
on  the  orthogonal  group  0(k)  and  the  differential  form  (dV)  is  the 
normalized  invariant  measure  on  the  stiefel  manifold  of  k-frames  in 
space  R11 . 

Eiobabi. Density  Eyi'ct^on  of  Sj.ngu2ar  Values  of  F 

Using  relation? (7)  to  (13)  and  integrating  out  U  and  V  over  the 
orthogoal  group  0  (k)  and  Stiefel  manifold  of  k-frame  in  fPw.r.t  .  the 
invariant  measures  d(U)  and  d(V).  The  joint  p.d.f.  of  the  singular 
values  of  F  is  obtained  as 


k(l-J)  k2/2 


,sR) 


1  k  ^2* 1  k  *2^  lvl  I  -K.rI 
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n72|K 

C  1  -kr 


r/2  n. (sj-s2)(  n  s.)n-k  * 


i,,-i 


V  V 

p^O  Pir. It  TrTTT  ^ 


1<D 

2, 


P,VIK,C^<In) 


i- 1 


(14) 


g  Uj , 


Joint  Probability  Densi ty  Function  of  Eigen  Values  of  FFT 


Jacobian  of  the  transformation  (6)  is  as  in  (15): 


JA  (sl' - ,sk)  ~  .I!  s  ^  =  2kdet(S)  (15) 

using  (15)  in  (14),  we  obtain  the  joint  p.d.f.  of  the  eigenvalues  of 
FF  as 


k2/2 


A  2  / 


'V 


>kn/2 


rk|-2»rk(Z|IScl"/2ll<R'iY/2(in/i’ 


(A) 


n-k-1  , 

— 2 k 

n  ( -  x . )  x 

i<j  1  1 


(16) 


Y  Y  2^°  /v'/»  -R 

P=0  P  (IK^/;Trnllry 


Probabil ity  Density  Fy  not  .ion  Ff"1  Matrix 

It  can  be  proved  that  (see  [1]);  if  p.d.f.  of  F  is  known 
T  —  f 

p.d.f.  of  FF  can  be  derived  from  it  by  using  (17): 


g(F  FT)  d(FFT) 


Using  g(F)  of  (10)  in  (17),  we  obtain  (18)  as  the  p.d.f.  of  FFT: 


kn/2  *  . -  — 

'  7~P)  ff  T  -k-1/2d(rrT) 

kV  FF 


T  T 

FF  +A (FF  ) 


(17) 
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the  important  observation  from  (18)  is  that  if,  in  (18),  we  let  K  =1  , 

T  - 1  " R  _n 

we  will  obtain  Wishart  density  W(FF  ,k,n,Kc).  Therefore,  the  Wishart 

distribution  becomes  a  special  case  of  (18).  The  importance  of  this 

result  is  in  triat  =  lft  means  independence  among  the  columns  of  F  and 

this  is  the  condition  on  whose  basis  the  Wishart  distribution  is 

derived  [5],  If  we  let  KR=In  and  column  dimension  k=l ,  then  Kc  will 

be  y“  and  (18)  will  give  us  the  p.d.f.  of  a  chi-squared  random 

variable  with  n  degrees  of  freedom. 


Probability  Density  Function  of  the  Lajrgest  Singular  Value  of  F 

Using  lemma  3.3  established  in  [6]  and  performing  the 
2 

transformation  hj=X^/X^  for  i=2,...,k,  we  can  establish  the  following 
lemma 


where , 


69 


q  (s 


(a).  =  n  (a-i(i-l)) 
p.  ;  _  i  2  P  ■ 


and 


(P1'P2 . V 


i  =  l 


(20) 


using  (14)  and  (19) ,  the  d  d  f  nf  e 

,  .  P-  .  .  of  s  ,  i.e.  the  largest  singular 

value  of  F  will  be  obtained  as  (21): 


g(s1) 


1 


r  ) 


”  kii  2 

_ _ k _  kn-1  \ '  ~2~  k  sx  p^-> 

2‘?*"1|Kcr/2lKRik/2  rk(n-+5+i)  1  P  0  ~  ^  j?7ntS±T) 


'2  P 


\  2  / 


CA(-^1)C,  (-R1) 


(21) 


CA(Ir 


be  proved  [1]  that  g(s1>  converges  and  is  upper  bounded  by  a 
well  defined  function  as 


F 


)  < 


kn 


'  — C 


I  n/2 


|K 


<k/: 


R 1 


r  (*±I) 

_  k  v  2  ’ 

r  (n-^:  tl  ) 


,kn-l 


[H 


lo2 


.-l, 


kn  ; 

,-e 


S. tr (K  ) 


lc2.  ,  -i  jS^tr(K~  '  )1 

2sltr(-c  )e  1  C 


(22) 


It  is  conjectured  that,  for  known  K  R  and  1C  expected  value  of  s,  can 
be  obtained  computationally  up  to  a  desired  degree  of  accuracy. 

E---—  of  the  Sum  of  Sguare  of  Singular  Values  of  F 

Using  (18)  and  some  careful  calculations,  we  can  obtain  the 
P.d.f.  of  the  sum  of  sguare  of  singular  values  of  F 
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I 


<3  (  s2) 


(L  s’,  2 

i  =  l  1 


_  j=i  y*  y  >  ^  — r  '  2  i 

,kn/2  ,kn.  ,  „  ,n/.!.  ,  k/2  ^  V  - I",  .  - - 


V^Vr1)  (4?1si>p 


T(^)|Kc|“/-iKR|1 


P=0 


C/V  P! 


Per ivat ion  of  Moments  Related  to  F  and  Singular  Values  of  F 
The  mean  value  of  FFT  is 


E{FF  }=  Kctr(KR) 


The  mean  value  of  the  sum  of  square  of  singular  values  is 


X\ 

E{S  '^}  =  tr(^C  0  V  =  tr^JtrtKn) 


For  n=K,  moments  of  products  of  the  singular  values  can  be  derived. 
Using  multivariate  gamma  function  and  applying  the  concept  on 
generalized  variance  introduced  in  [7,  pp.  171],  we  can  obtain  the  mth 
moment  of  the  product  of  the  singular  values  as 


E{(  n  s  )m} 
i=l  1 


km  m  m  _  ,k+m. 


rk(^ 


the  second  moment  or  variance  of  the  product  of  singular  values  will 
be 
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(27) 


E{  (.T^sJ2}  =|KC|  |KR|k! 

The  preceding  results  have  applications  in  mathematics,  control  and 
communication  sciences.  Their  application  in  image  processing  and 
pattern  recognition  has  been  explored  in  [1].  In  the  derivations,  F 
can  be  a  sample  window  from  a  Texture  field.  The  p.d.f.  of  F 
presented  in  this  report  results  in  the  p.d.f.  of  the  singular  value* 
of  F  which  ultimately  results  in  the  p.d.f.  of  the  largest  singular 
value.  For  known  and  KR,  and  from  p.d.f.  of  the  largest  singular 
value,  it  is  possible  to  computationally  obtain  the  expected  value  of 
the  largest  singular  value  of  F:  the  best  model  for  and  KR  is  a 
first  order  Markov  covariance  matrix  which  is  of  toeplitz  form.  The 
eigenvalues  of  a  toeplitz  form  can  be  derived  and  as  a  result  the 
zonal  polynomials  C^(K c  )  and  C^(KR  )  can  be  obtained  and  used  in 
equation  (21).  (21)  reaches  its  steady  state  value  rapidly  and  hence 
can  be  programmed  efficiently  on  a  computer.  The  expected  value  of 
the  largest  singular  value  provides  an  upper  bound  to  the  average 
variation  of  the  rest  of  the  singular  values.  The  above  computations 
can  be  used  in  designing  a  texture  classifier.  It  can  also  be  used  in 
Bhattacharyya  distance  analysis  of  texture  fields. 
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2.5  Models  for  Texture  Analysis  and  Synthesis 
David  Donovan  Garber 


1  nt  roduct ion 


Texture  is  an  important  character ist  ic  for  the  analysis  of 
images.  Accordingly,  the  study  of  this  image  feature  has  led  to  the 
proposal  of  models  attempting  to  describe  it  [1-41.  These  models  are 
for  the  most  part  ad  hoc  as  have  been  texture  discrimination 
techniques  [ 5 ] .  Almost  all  researches  have  made  no  more  than  a  slight 
attempt  to  apply  their  models  to  real  natural  textures. 

Presented  in  this  short  paper  are  the  results  of  independent 
texture  research  which  is  based  upon  an  extension  of  earlier  work  at 
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USC  [6]  showing  application  of  models  to  a  variety  of  real  natural 
textures.  Many  of  the  basic  ideas  derived  from  work  in  this  area  are 
similar  to  those  proposed  in  earlier  texture  studies  [4,7,8]. 

The  2-D  Binary  Markov  Model 

In  the  investigation  of  natural  phenomenon  once  a  researcher 
collects  enough  data  he  tries  to  imagine  a  process  which  accounts  for 
the  results.  The  construction  and  development  of  a  mathematical  model 
is  often  the  best  way  this  can  be  done.  In  some  cases  the  model  may 
be  extraordinarily  complex,  in  others,  exceedingly  simple.  Tn  most 
cases  model  acceptance  cannot  be  based  on  "truth"  as  the  true 
generating  phenomenon  is  too  complex  or  just  unknown  and  so  it  is 
based  upon  model  usefulness  and  "how  well  it  works."  Such  "working" 
models  for  texture  are  presented  here  because  they  have  the  ability  to 
simulate  natural  textures. 


If  one  can  synthesize  and  simulate  natural  textures  adequately  by 
using  some  proposed  model  the  criterion  of  usefulness  and  workability 
for  that  model  is  met.  A  researcher  may  then  apply  the  model  to 
problems  of  texture  identification  and  discrimination  with 
justification  in  a  non-ad  hoc  way  until  a  better  texture  model  is 
developed . 


Many  early  texture  studies  involved  the  use  of  binary  textures 
generated  by  one-dimensional  Markov  processes.  Such  work  was  used  by 
Purks  and  Richards  [9]  and  later  extended  by  Garber  [6].  In  these  one 
dimensional  models  a  large  vector  of  pixels  was  generated  line  by  line 
using  a  set  of  generation  parameters 


JV  ^1  '^2  1 
N+l 


'V 


where 


JV  ^ V 1 '  V2 '  ’  ‘  '  '  VN  ^ 

N  +  l 


P(VN+1/V1'V2‘ 


'V 


(1) 
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and  P ( A/B)  represents  the  probability  of  A  given  B.  Tn  the  above 
notation  each  represents  a  generated  pixel  which  has  value  0 
(black)  or  1  (white).  Each  pixel  value,  then,  depends  on  the  N  pixels 
previous  to  it.  A  two  dimensional  texture  image  was  then  formed  by 
breaking  up  the  large  vector  of  pixels  into  shorter  strings  and 
stacking  them  one  on  top  of  the  other.  This  procedure  for  large 
images  nearly  insured  image  row  independence  (unless  N  was  large) 
thereby  creating  only  horizontally  oriented  textures  totally 
unsuitable  for  simulating  natural  two-dimensional  textures. 

By  allowing  N  to  increase  exceeding  the  short  string  line  length, 
two-dimensional  (vertical  and  horizontal)  dependence  may  be  induced 
into  the  generating  process.  A  pixel  value  then  depends  not  only  on 
the  pixels  previous  to  it  on  the  same  line  but  also  on  the  pixels 
above  it  (see  Figure  1).  In  theory,  texture  dependence  could  be 
extended  ad  infinitum,  however  practical  considerations  concerning  the 
actual  generation  process  show  us  that  2  generation  parameters  must 
be  accounted  for.  As  a  possible  solution  to  the  storage  problem  we 
can  choose  to  ignore  all  but  N  of  the  previous  pixels  in  our 
generation  process  and  we  can  allow  the  pattern  of  the  N's  to  become 
flexible. 


Although  it  was  not  explicitly  stated  by  Garber  [61  the 
genera  ion  parameters  of  a  texture  may  be  estimated  for  any  given  set 
of  N  V  s.  These  statistics  have  the  property 


EIGV  (V  ,V . V)]=G  (V  ,V . V  ) 

N+l  N+l 


where 


G, 


y  (V1,V2,...,VN)  P ( v j , v2 , . . . , , V^+| ) / 

N  1 


(2) 


(PIV-.V, . VM,0)  +  P(V.,V, . V  I)) 


gaggnSBi 

stmt's 


Wm* 


>':'*< 


Figure  3.  Bark 


Figure  1 


Figure  2.  Grass 


One-dimensional  Synthesis 


Two-dimensional  Synthesis 


Ignoring  boundary  conditions,  the  best  linear  unbiased  estimates 
(P's)  of  the  P's  are  naturally  defined  by 

M  N 

P(V  ,V  ,...,V  )  }  £  11  6  ( I (K  + j )  -V  )  (3) 

j-0  K-- 1 

where  I(i)  represents  the  ith  element  of  the  texture  string  from  which 
the  parameters  are  to  be  estimated. 

By  using  this  method,  generation  parameters  were  estimated  from 
"parent"  binary  textures  and  used  to  synthesize  imitation  binary 

textures.  (The  V.'s  were  chosen  to  minimize  the  sum  of  squares  error 
assuming  a  linear  model  (see  next  section).)  These  (512)‘  parent 

textures  are  illustrated  in  Figures  2-11  (a).  Synthesized  binary 

(512)"  simulations  are  shown  in  Figures  2-11 (b). 

As  the  estimated  texture  generation  parameters  are  approximated 
using  statistics  gathered  from  the  full  parent  texture,  non-homogeni t y 
in  the  parent  texture  will  cause  an  "average"  texture  to  be 

synthesized.  Because  of  its  non-homogeneous  nature  (specifically  the 
directionality  of  the  stalks  in  different  parts  of  the  image)  and 
exhibition  of  detail  (specifically  individual  non-conforming  single 
stalks)  the  simulation  of  straw  will  not  be  perfect.  A  similar 
observation  may  be  made  with  respect  to  the  parent  textures  of  qrass 
and  water  but  in  these  textures  the  non-homogeni ty  >s  not  so 

pronounced.  As  we  are  attempting  to  synthesize  textures  and  not 

merely  "image  code"  the  parent  textures,  details  and  non-homogeneities 
will  be  lost  in  the  synthesis  process. 

Continuous  Tone  Linear  Texture  Model 


In  many  problems  a  grey-scale  image  is  first  converted  into  a 
binary  picture  by  quant i zot  ion .  As  was  seen  in  Figures  2-11,  much  of 
the  original  texture  information  is  retained  by  such  a  quantization. 


(a)  (b) 

Figure  6.  Leather 
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Still  a  model  for  continuous  tone  textures  is  desirable. 


As  in  the  binary  case,  the  texture  model  is  closely  related  to 

the  generation  process  which  successfully  simulates  it.  In  the  binary 

case  we  chose  a  Markov-N  model.  A  similar  model  can  be  chosen  for  a 

g-grey  level  image  however  the  retention  and  estimation  of  a  full  set 

N 

of  generation  parameters  in  this  case  is  impossible  as  it  reouires  g 
locations  (for  just  16  levels  with  N=8  this  is  over  4xl09  parameters) . 
After  much  thought  a  simple  linear  model  was  proposed.  This  model  has 
also  been  studied  by  Tou  and  Chang,  McCormick  and  Jaya r amur thy ,  and 
Deguchi  and  Morishita  but  their  results  did  not  definitively  indicate 
model  generality  [5].  The  model  may  be  expressed  as 


Y  =  X3  +  e 


where 


Y  =  Y 


N  +  l 


(4) 


and  B  is  an  (N+l)xl  vector  of  unobservable  parameters,  e  is  an  (N  +  l)xl 
unobservable  random  vector  such  that  E[e]=0,  0<V<g.  The  most  common 

,  ,  _  i  1 

estimator  of  8  is  (X'X)  X'Y,  the  least  squares  estimator.  So  given  a 

A 

parent  texture  an  estimation  of  B,  B,  may  be  obtained  yielding  the 
generation  model 


Y  =  XB 


(5) 


and  noise  can  be  added  to  our  generation  process  at  will.  After 

2 

carefully  choosing  X  and  estimating  B  for  each  512  parent  texture 
■  .  2 

shown  in  Figure  12-20(a)  512  synthesized  Monte  Carlo  simulat  ions  were 
generated  shown  in  figures  12-20(b). 
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Figure  13.  Straw 
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Figure  15.  Wool 


Conclusions 

By  expanding  the  above  models  and  experimenting  with  the  V  as 
well  as  the  noise  parameters,  improvements  in  the  appearance  of  the 
synthesized  textures  could  be  obtained  at  additional  compute  cost.  It 
would  probably  be  quite  a  challenge  to  develop  a  segmentor  powerful 
enough  to  separate  the  simulations  from  their  parents.  This  is 
illustrated  by  passing  a  l/4"-square  (55  pixels)  window  over  the 
texture  pairs  and  noting  the  similarities. 

Based  upon  the  results  of  this  work,  it  is  suggested  that  the 
auto  regression  and  Markov  models  should  be  exercised  more  closely 
with  respect  to  their  application  in  texture  analysis,  identification 
and  synthesis  in  future  study.  Such  work  and  more  details  of  the  work 
sketched  in  this  paper  will  appear  in  later  publications. 
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2.6  Structural  Object  Recognition  in  Aerial  Images 
K.  Ramesh  Babu 


In  this  report  we  describe  the  progress  in  trying  to  develop 
algorithms  to  recognize  objects  with  structural  descriptions  in  aerial 
images.  This  is  part  of  a  larger  effort  at  U.S.C.  to  produce  an  I.U. 
System  [5].  This  work  is  still  in  developmental  stages;  we  desribe 
the  method  in  terms  of  airport  recognition.  Based  on  the  model  of  a 
scene,  airports  are  approximately  located;  for  precise  location,  we 
need  to  perform  recognition  based  on  (a  bank  of)  airport  models. 
Previous  work  [2]  establishes  a  low-level  processing  technology  that 
we  propose  to  make  use  of  for  higher-level  processing.  We  have 
already  employed  the  linear  features  for  describing  roads  [31  and  we 
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address  ourselves  here  to  the  problem  of  object  recognition. 

The  R&cogn i t  ion  Process 

The  recognition  process  makes  use  of  models  of  the  objects  to  be 
recognised.  In  order  to  generate  constraints  that  will  aid  in 
recognition,  features  fjrom  _the  image  that  can  be  matched  with  various 
model  details  are  produced.  Then,  a  search  procedure  will  find 
possible  instances  of  the  model  in  the  image  by  trying  to  correspond 
each  individual  model  detail  with  image  features;  in  general,  one 
finds  more  than  one  correspondence  for  the  model  due  to  various 
factors  -  inadequacy  of  the  model  details  used,  symmetries,  and  noise 
in  the  detected  features.  Finally,  a  set  of  rules  will  often  be 
necessary  to  disambiguate  these  correspondences  and  pronounce  a  final 
match . 

The  next  sections  describe  the  details  of  modeling,  image  feature 
extraction,  and  matching. 

Model ing 

Our  model  employs  data  of  the  following  nature: 

1.  Runways  and  taxiways  are  specified  as  vectors  in  a  (rectangular) 
coordinate  system,  i.e.,  these  are  directed  straight  lines  with 
specific  position  and  length.  (Fig.  1). 

2.  Parallelism  relationships.  These  are  described  by  specifying  the 
two  runways  and  the  distance  of  separation  between  them. 
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type  runway  =  1 . . noof r unwa ys ; 
distance  =  real; 
angle  =  0..359; 
parallels  =  record 

rl ,  r2 :  runway; 
d:  distance; 
end ; 

3.  When  runways  meet  or  intersect,  they  can  do  so  in  various  ways. 
We  have  described  in  Figs.  2-4  the  more  commonly  encountered  type  of 
relations  (between  two  runways)  --  the  X-junction,  the  Y-junction  and 
the  L-junction.  As  will  be  explained  later,  in  the  section  on 
matching,  it  becomes  necessary  to  distinguish  two  types  of  Y-junctions 
and  four  types  of  X-junctions.  When  more  than  two  runways  participate 
in  the  intersection,  the  description  is  a  list  of  pairwise 
relationships.  An  example  will  serve  to  illustrate  these  points 
(Fig .  5)  . 

4.  Positional  relationships.  lj)rt  of  and  right  of.  A  runway  can  be 
to  the  left  of  another  or  to  the  right.  Since  runways  are  described 
as  vectors,  these  relations  can  be  unambiguously  defined. 

A  binary  intersection  relationship  is  described  by  the  type 
intersection : 


(a)  "bright  Y 


(b)  "dark  y" 


Fig.  3.  Distinctions  betweer 
the  Y-junction. 
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Model  detail  # 


Description 


rightturn 


airport  model  of  Fig.  1 


iiT? 

inttype  •  (Xjunction,  brighty,  darky,  leftturn,  converqe, 
diverge,  riqhtturn); 

ltypo  -  (leftturn,  converqe,  diverqe,  riqhtturn); 
intersect  ion  ■  record 

i : i nt  t  ype ; 
r 1 , r  2 : r  unway ; 
a  :  a  nq  1  e  ; 
end ; 


Two  po l n  t  s 

1 .  Anq les 

2 .  In  the 
the  sides 
t  he  r  unway 
the  minor 
r  unway . 


reqatdinq  these  descriptions  must  be  noted  here, 

ate  measured  counterclockwise  from  rl  to  r2. 

Y- junction  description,  the  runway  which  extends  on  both 
of  the  point  of  intersection  is  called  the  major  runway  and 
which  is  terminated  by  the  point  of  intersection  is  called 
runway.  rl  represents  the  major  runway  and  r2,  the  minor 


The  user  interface  tor  inputtinq  the  model  descriptions  is  still 
in  a  staqe  of  development.  At  the  time  of  writing,  the  user  inputs 
the  model  by  listinq  intersections  and  parallelisms  as  a  text  file 
accotdinq  to  a  certain  format  |b).  Eventually,  we  plan  to  automate 
the  model  detail  computation  -  intersect  ions ,  parallelisms  & 
positional  relationships  -  thus  requiring  that  the  user  only  input 
runways  and  taxi ways  as  vectors  in  a  coordinate  system. 

Image  Feat  ure  Description 

The  choice  of  i nt e r sec t ions  as  the  principal  model inq  unit  is 
motivated  by  the  relative  ease  with  which  similar  imaqe  descriptions 
can  be  built.  In  (31,  we  described  techniques  for  extracting  linear 
features.  Apars,  or  anti-parallel  line  segments  of  a  certain  width, 
can  be  used  as  descriptions  of  portions  of  runways.  At  points  in  the 


•\ 
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image  where  runways  intersect  or  meet,  it  is  possible  for  the  apars  to 
be  sharing  a  common  supe r segment .  A  supersegment  is  a  single, 
piecewise  linear,  curve  providing  continuity  of  edge  data  in  the 

image.  We  have  called  such  a  collection  of  apars  as  saps  121.  A  sap 
(short  for  super  ant i- pa r al 1  el )  is  a  collection  of  apars  with  a  common 
supersegment  [2]  providing  one  (or  both)  of  the  component  segments. 
Moreover,  the  apars  can  be  ordered  according  to  their  occurrence  in  a 
single  traversal  of  the  supe r segment . 

It  is  this  data  structure  of  sap  that  we  believe  is  also  useful 
in  providing  us  a  base  on  which  to  hypothesize  a  possible 
intersection.  In  Fig.  6,  for  example,  apars  1,  2,  3,  and  4  form  a 
sap.  It  is  reasonable  to  suppose  that  an  intersection  is  present  at 
the  meeting  point  of  the  axes  of  the  apars  2  and  3.  Although  the 
intersection  in  the  figure  is  described  as  an  x-junction,  the  internal 
description  can  make  a  further  distinction  because  apars  can  be 
considered  as  vectors.  Thus,  a  variable  of  type  1 t ype  is  used  to 
describe  the  directions  of  the  apars.  While  the  data  sufficient  to 
describe  the  intersection  are  the  two  apars,  for  gaining  ease  in 
subsequent  computing,  the  included  angle  between  them  and  their 

directions  are  both  noted.: 

type  image int  =  record 

t:  ltype; 
al ,  a2:  dapar ; 
a  :  angle; 
end ; 

The  type  dapar  serves  to  treat  apars  as  vectors.  Thus,  image 
descriptions  are  solely  made  up  of  a  list  of  4-tuples.  Further,  we 
order  the  apars  in  such  a  way  that  the  angle  is  measured  anticlockwise 
from  the  first  a par  in  the  tuple  to  the  second.  (At  the  time  of 

forming  these  descriptions,  it  is  useful  to  note  the  col  1  inear i t ies , 

viz,  apars  1  and  2  are  collinear  as  are  3  and  4). 
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The  Matching  Process 


We  have  viewed  the  matching  process  as  the  problem  of  relating 
each  model  runway  to  a  set  of  col  linear  apars.  In  the  algorithms, 
however,  a  set  of  col  linear  apars  can  be  identified  by  a 
representative  apat  ot  the  set  and  thus,  the  matching  problem  becomes 
a  labeling  problem  [1,4]. 

The  Label  ing  Problem.  Given  a  set  of  units  U  =  lu  j ,  u  ,,...,  u  1 ,  a 
set  of  labels  L=  • .4  j ,  0  }  and  a  set  of  constraints  as  elements  of 

'l  . 

L'xL  ,  1  =  1, 2,  .  .  .  ,m  find  elements  in  itlxL)  consistent  with  the 
constraints.  These  elements  are  called  consistent  labelings  [4]  or 
global  constraints  [1],  A  labeling  is  a  mapping  of  all  the  units  into 
single  labels. 

In  11],  Shapiro  and  Haralick  attempt  to  produce  a  labeling  which 
maps  each  unit  to  a  label,  where  any  subset  of  units  and  their  labels 
is  consistent  with  some  constraint (s)  in  {UxL}1,  i=l,2,...,m.  On  the 
other  hand,  Freuder  [4]  considers  the  problem  as  one  of  producing  or 
ng  a  constraint  involving  all  the  units  and  their  labels. 
These  two  approaches  result  in  different  algorithms:  |1]  merely 
reduces  the  amount  of  search  in  an  otherwise  straight  forward  search 
while  l 4 |  systematically  builds  constraints  of  the  next  higher  order 
until,  finally,  a  global  constraint,  or  a  constraint  where  all  the 
units  are  labeled,  is  found. 

In  our  case,  the  units  arc  runways,  the  labels  are  directed  apars 
and  the  constraints  are  the  primitive  1  abe lings.  A  primitive  labeling 
is  any  independent,  constraint  on  a  (set  of]  model  runway(s)  .  Tn  our 
problem,  however,  we  do  not  use  unary  constraints  (involving  only  one 
runway)  ,  because  these  unary  constraints  --  viz.,  length,  width, 
brightness  --  do  not  seem  to  provide  useful  discrimination.  Thus,  all 
we  have  as  source  of  independent  constraints  are  intersections, 
parallelisms  and  positional  relationships,  and  these  can  only  form 


binary  constraints  (Observe  that  we  represent  model  intersections 
involving  more  than  two  runways  as  a  set  of  model  intersections 
involving  exactly  two  runways).  A  match  between  a  model  intersection 
and  an  image  intersection  is  achieved  when,  in  an  imaginary 
positioning  of  the  image  intersection  on  the  model  intersection  with 
one  point  of  intersection  on  top  of  the  other,  the  runways  and  apars 
are  aligned.  It  is  clear  that  the  included  angle  of  the  image 
intersection  will  be  appr ox imat e 1 y  the  same  as  that  of  the  model 
intersection.  At  this  point,  the  runways  and  the  cor  respond inq  apars 
may  be  of  the  same  orientation  or  of  opposite  orientation.  Thus,  in  a 

described  below,  rl  matches  with  al  and  r2  matches 
with  a2.  Again,  the  t ype  dapar  serves  to  preserve  the  orientation 
information  about  the  apars.: 


type  primitive  labeling 


r  ecord 

r 1  ,  r  2  :  r  unwa  y ; 
al,a2:  dapar; 
end ; 


The  distinctions  mentioned  earlier  amonq  the  various  categories 
of  L-junctions  and  Y-junctions  are  valuable  in  the  formation  of 
primitive  labelings.  In  general,  an  image  intersection  can  match  an 
X-type  intersection  in  any  one  of  the  four  angular  positions.  With  a 
Y-type  model  intersection,  however,  there  are  only  two  different 
matches  possible.  Finally,  with  the  L-type  model  intersection  there 
can  be  only  one  match.  Thus,  according  to  whether  the  model 
intersection  is  X-type,  Y-type,  or  L-type,  a  maximum  of  4,  2,  or  1 
primitive  labelings  are  formed,  respectively,  for  each  matchinq 
image-model  intersection  pair.  In  all  these  cases,  the  apars  may 
match  the  runways  as  they  are,  or  with  their  directions  reversed. 


The  output  of  our  matching  is  a  set  of  m^x^mal^  label  ings.  A 
maxima  1  label ing  is  a  mapping  of  units  into  labels  where  not  all  units 
will  be  mapped;  morever,  it  is  not  possible  to  to  find  another 
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labeling  which  is  such  that  --  the  set  of  units  having  labels  in  this 
labeling  is  a  super  set  of  that  of  the  maximal  labeling  and  the 
corresponding  labels  are  equivalent.  The  equivalence,  in  our  case, 
translates  to  col  1  inear ity. 

With  the  problem  defined  as  above,  we  note  here  the  differences 
between  our  algorithm  and  those  in  [1]  and  [4): 

1.  In  (11,  the  aim  is  to  generate  a  labeling  which  labels  a21 
the  units;  we  cannot  afford  to  take  such  a  viewpoint  because  of  the 
presence  of  noise  in  real  images. 

2.  Both  in  [1]  and  (41,  any  constraint  not  compatible  with  any 
primitive  or  partial  labeling  is  removed  from  further  consideration; 
again,  we  cannot  afford  to  remove  constraints  because  we  are  never 
sure  that  a  partial  labeling  is  "more  correct"  than  the  primitive 
1  abe  ling. 


Our  search  procedure  is  designed  to  incorporate  the 
abovement ioned  differences,  and  proceeds  according  to  the  tree  of 
Fig.  7.  Each  node  of  the  tree  records  information  about  the  (partial) 
labeling  formed  upto  that  point  in  the  search  and  a  set  of  runways 
responsible  for  making  the  labeling.  (These  runways  are  sometimes 
referred  to  as  "runways  of  a  node”  in  the  following  text). 
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Search  tree 


type 

1 abel ing 

=  record 

a:  array [ runway]  of  dapar; 
end ; 

node  = 

record 

1  :  1 abel ing ; 

rused:  set  of  runway; 

end ; 

In  the  search  tree  of  Fig.  7,  we  have  considered  a  hypothetical  case 
where  each  runway  forms  exactly  two  primitive  labelings.  The  numbers 
alongside  each  branch  refer  to  the  runway  that  is  used  in  choosing  a 
constraint  for  continuing  the  search  along  that  branch.  While,  in 
general,  this  constraint  can  involve  one  or  more  runways,  our  problem 
does  not  contain  unary  constraints.  These  constraints  are  nothing  but 
the  primitive  labelings  of  a  model  intersection  involving  the  runway 
in  question.  These  primitive  labelings  are  each  responsible  for  the 
formation  of  a  successor  node. 

The  initial  node  S  will  have  an  "empty"  labeling  and  an  empty  set 
of  runways.  Search  proceeds  by  considering  any  runway  that  is  not  in 
the  node;  we  consider  all  the  model  intersections  formed  by  that 
runway  and  form  primitive  labelings;  each  one  of  the  primitive 
labelings  will  enable  us  to  form  a  successor  node.  Observe  that  the 
search  tree  will  not  be  balanced. 

The  formation  of  a  successor  node  involves  (i)  assigning  a  new 
(partial)  labeling  and  (ii)  adding  the  runway  responsible  for  the 
formation  into  the  set  of  runways  of  the  node.  The  successor  node  is 
formed  in  such  a  way  that  - 

1.  All  the  runways  that  are  labeled  in  either  the  labeling  of 
the  parent  node  or  the  primitive  labeling  are  labeled  in  the  labeling 
of  the  successor  node  and 
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2.  If  a  runway  is  labeled  both  in  the  label inq  of  the  parent 
node  and  in  the  primitive  labeling,  then  the  labels,  i.e.,  apars,  must 
either  be  same  or  coll  inear;  further  in  the  new  label inq,  the  runway 
gets  labeled  as  one  of  the  apars,  which  is  thereafter  considered  the 
representative  of  the  collinear  group  of  apars  labelinq  the  runway. 

3.  The  runways  of  the  successor  node  =  union  of  runways  of  the 
parent  node  and  the  runway  responsible  for  expansion. 


Note  also  that  the  parent  node  should  also  be  considered  for 
further  search  but  with  a  specific  change;  the  runway  responsible  for 
the  current  expansion  of  the  node  should  be  included  in  its  set  of 
runways.  The  successor  nodes  are  tested  aqainst  a  global  criterion 
and  discarded  if  they  fail  to  satisfy  it.  Currently,  as  implemented 
now,  the  global  criterion  is  relative  angular  disposition  of  the 
r  unways . 

When  a  node  can  no  longer  be  expanded,  it  represents  a  maximal 
labeling  for  our  problem.  However  this  labelinq  may  not  completely 
characterise  the  runways  because  all  the  runways  might  not  be  labeled 
and  further,  the  labeled  runways  might  not  be  completely  described  by 
the  labeling  apars.  A  further  search  is  required  to  complete  the 
runway  descriptions.  At  the  time  of  writing,  this  secondary  search  is 
not  implemented. 

?2D£ly^2D9  R£!!'2I^2 

The  matchinq  procedure  we  have  described  essentially  exploits 
only  connectivity  relationships  and  hence,  in  qeneral  ,  we  should 
expect  to  find  many  more  labelings  than  the  correct  one,  if  it  exists. 
Noise  and  symmetries  also  add  to  this  number.  We  plan  to  utilise 
positional  relationships  in  successfully  filtering  out  some  globally 
unwanted  labelings.  The  effectiveness  of  the  procedures  desribed  here 
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need  to  be  evaluated  on  real  images. 
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2.7  Model  to  Image  Matching-Continued 


Keith  E .  Pr ice 


This  note  reports  on  the  current  status  in  the  development  of  a 
system  for  matching  models  of  a  scene  with  images  of  the  scene.  More 
detailed  discussions  and  references  to  other  work  have  appeared  in 
previous  semi-annual  reports  and  a  complete  description  will  appear  in 
a  forthcoming,  separate  report. 

The  system  uses  both  region  based  and  edge  based  segmentation 
techniques.  The  edge  based  method  is  used  to  find  prominent  linear 
objects  (especially  roads) ,  and  the  region  based  method  extracts  the 
other  objects  which  are  easily  described  as  connected  regions.  These 
segments,  lines  and  regions,  are  used  as  the  basic  elements  in  the 
symbolic  description  of  the  image,  which  is  completed  by  extracting 
various  features  of  the  segments.  The  user  model,  which  internally  is 
the  same  form  as  the  image  description,  is  derived  through  a  dialog 
between  the  program  and  the  user.  The  features  used  relative 
positions,  etc. 

The  model  is  matched  with  the  image  to  determine  which  image 
segment  corresponds  with  each  model  element.  At  this  point  several 
improvements  have  been  made.  In  aerial  images  the  linear  feature 
extraction  system  tends  to  break  objects  into  several  pieces.  Figure 
1  is  an  aerial  view  of  the  Stockton,  California  area  with  a  major 
highway  running  from  the  top  to  the  bottom.  Figure  2  shows  the  major 
lineat  features  extracted  from  this  image.  This  major  road  has  been 
broken  into  many  small  pieces,  with  several  large  gaps.  Also,  many  of 
the  portions  which  appear  to  be  connected  are  actually,  in  the  current 
linear  feature  description,  several  unconnected  elements.  Now  the 
system  allows  for  multiple  matches  with  a  model  element.  As  always, 
whether  a  model  element  is  used  once  or  more  than  once  depends  on  the 


Figure  3. 


Figure  4. 


Recognition  ot  Segments  in  an  Aerial  View  of  San  Diego 


Recognition  of  Region  and  Linear  Segments  in  Stockton. 
This  shows  the  capability  of  the  system  to  locate  multiple 
pieces  of  a  given  object.  Note  that  there  are  several 
pairs  of  channels  and  roads  with  overlapping  labels 
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task  and  model  descriptions.  Figure  3  shows  how  this  can  be  used  to 
identify  multiple  portions  of  the  road  segments,  with  multiple 
segmentations  being  given  the  same  name.  Also  some  of  the  narrow 
river  channels  have  been  located  along  with  the  adjacent  road,  their 
names  overlap  because  of  the  automatic  label  position  method.  Figure 
4  shows  another  example  of  multiple  segments  for  a  road  for  a 
different  scene. 


2.8  Comparison  Between  MAP  and  LMMSE  Filters  for  the  Restoration  of 
Images  with  Poisson  Noise 

C.M.  Lo  and  A. A.  Sawchuk 


In  several  previous  reports,  mathematical  models  for  images 
degraded  by  Poisson  noise  have  been  presented  and  a  maximum  a 
posteriori  (MAP)  nonlinear  filter  for  image  restoration  has  been 
derived  (1,2).  In  this  report,  we  make  a  comparison  between  the 
linear  minimum-mean-square  error  (LMMSE  or  Wiener)  filter  and  the  MAP 
filter  for  restoring  images  with  Poisson  noise.  The  Wiener  filter  is 
much  easier  to  implement  although  it  is  only  a  linear  processor  which 
is  not  expected  to  give  as  good  a  result  as  the  nonlinear  MAP  filter. 

Structure  of  the  LMMSE  Restoration  Filter 

From  estimation  theory  [ 3 ]  —  ( 9 ]  the  LMMSE  filter  transfer  function 
W(u,v)  is  derived  by  minimizing  the  expectation  of  the  squared  error 
between  the  original  object  and  the  estimated  object.  Use  of  this 
criterion  or  the  equivalent  orthogonality  principle  [5]-[8]  yields  the 
transfer  function  of  the  LMMSE  filter  as 
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Wp (u  f  v) 


(1) 


(u ,  v) 
fu~,  v) 


where  is  the  cross-spectral  density  of  the  detected  image  g(x,y) 
and  the  object  f(x,y)  while  4*  is  the  spectral  density  of  the 
detected  image  g(x,y).  From  a  straight  forward  substitution  into 
Eq  •  ( 1 )  [101,  we  have 


nV* (u,v) $  r (u,v) 

W  (u,v)  =  — - - j - 

P  1+N|  |V(u,v)  |  |  4>f  (u,v) 


(2) 


where  N  is  the  mean  number  of  photon  counts  in  the  detected  image, 
Mu,v)  is  the  Fourier  transform  of  the  degrading  point-spread  function 
( PSF )  h(x,y)  and  ♦  (u,v)  is  the  spectral  density  of  the  object.  The 
detailed  derivation  of  Eq .  (2)  is  described  in  [10].  For  a  linear 
Gaussian  additive  noise  model,  the  Wiener  filter  is  [ 3 ]  —  [ 9 ] 


&'*(u,v)4>(u,v) 

W  (u,v)  = - - - 

|  |V(u,v)  I  I  (u,v)  +4>n  (u,v) 


(3) 


where  *n(u,v)  is  the  spectral  density  of  the  noise  which  is 
statistically  independent  of  the  signal,  <t>f(u,v)  is  the  spectral 
density  of  the  object  and  Mu,v)  is  the  Fourier  transform  of  h(x,y). 
Rewriting  Eg.  (2),  the  Wiener  filter  takes  the  familiar  form 


where 


W  ( u ,  v ) 


_  V  *  ( u ,  v ) 

|  |V  ( u ,  v )  |'|5  +  | 


(4) 


n 


( u ,  v ) 

(u,  V) 
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is  called  the  siqnal-to-noise  (SNR)  for  the  linear  additive  Gaussian 
no i se  model . 


Referring  to  Eq.  (2)  for  the  Poisson  noise  model,  the  LMMSF 
filter  can  be  written 


W 

P 


V  U  ,  v ) 


&  M  u  ,  v  ) 

|  |V(U,V)  I  |24 


1 


(5) 


where  N>t^(u,v).  Thus  the  function  W  (u,v)  is  the  same  as  Ww(u,v) 
except  that  a  is  defined  differently  from  S  .  although  Poisson  noise 
and  its  SNR  arc  signal -dependent ,  a  can  be  called  the  equivalent 
siqnal-to-noise  ratio  (SNR)  .  When  the  value  of  is  very  large, 
which  is  the  case  when  the  rate  function  is  hiqh,  then  Fq .  (5)  becomes 

W  (u  ,  v)  -v  ^  '  (n  ,  v)  ( 6 ) 
P 

and  the  LMMSF  filter  approaches  the  inverse  filter  in  the  absence  of 
Poisson  noise.  Indeed,  the  larger  the  rate  function,  the  lesser  is 
the  degradation  due  to  Poisson  noise.  In  this  case,  the  I.MMSF  filter 
only  needs  to  remove  the  blurring  degradation  effects.  As  discussed 
previously  [1],  |2),  Poisson  noise  effects  are  much  more  pronounced  at 
low  light  levels  when  the  value  of  a  is  smaller.  In  this  case,  the 
LMMSF  filter  w  is  dominated  by  Poisson  noise  and  image  signals  will 
be  seriously  distorted.  Thus  the  performance  of  the  LMMSF  filter  will 
be  expected  to  be  worse  at  lower  equivalent  SNR’s.  Although  the  LMMSF 
filter  is  based  upon  the  minimum  mean-square  error  criterion,  this 
estimation  error  is  a  minimum  under  the  condition  that  the  a  priori 
knowledge  is  perfect.  Functions  such  as  the  spectral  density  of  the 
object  ijr  and  the  mean  number  of  photon  counters  N  must  be  perfectly 


known*  In  reality,  and  N  are  never  perfectly  known  and  must  be 

estimated  from  the  observation  [11].  Hence,  the  actual  LMMSE  error 
does  not  reach  the  minimum.  In  short,  the  LMMSE  filter  tries  to  force 
the  solution  toward  the  inverse  solution  with  some  sort  of  smoothness 
controlled  by  the  equivalent  SNR 

Structure  of  the  MAP  Restoration  FUter 

It  has  been  noted  earlier  [1],  [21  that  the  fundamental  MAP 

estimate  contains  maximum  likelihood  (ML)  and  a  p£pp£i  terms  in  its 
solution.  The  MAP  estimate  can  be  written  in  the  equivalent  form 

f  =  f  +  X  R  .11T  (q-1 )  i-i) 

where  F  is  the  nonstationary  mean  of  the  object  estimated  from  the 
noisy  data.  The  physical  interpretat ion  of  the  MAP  estimate  is  that 
maximizing  the  probability  p(d  f)  forces  the  solution  toward  the 
inverse  solution  which  is  the  maximum  likelihood  solution,  while 
maximizing  the  probability  p(f)  is  equivalent  to  enforcing  a 
smoothness  criterion.  Thus,  the  MAP  filter  tries  to  balance  the 
inverse  solution  with  a  smoothness  constraint  [121.  Another  physical 
interpretation  from  Eq .  (7)  is  that  the  MAP  estimate  tries  to  move  th<; 
solution  of  the  estimate  f  from  the  a  pr£0£ji  nonst  at  i  ona  r  y  mean  f 

to  a  maximum  likelihood  solution  f  . 

-  ML 

BfsuJ.£s 

Our  experimental  implementation  of  the  LMMSE  filter  is  based  on 
Eq .  (5)  using  a  fast  Fourier  transform  algorithm.  The  ensemble  mean 
photon  counts  N  and  the  object  spectral  density  4' f  are  estimated  from 
observed  photon  count  data.  The  estimate  of  4*  f  can  be  made  by 
substituting  a  similar  "prototype"  spectral  density  suggested  by 
Cannon  in  the  blind  deconvolution  process  [131,  or  estimated  by  an 
iteration  method  suggested  by  Lim  in  a  method  of  image  restoration 
called  spectral  subtraction  (SSIR)  [111.  unfortunately,  they  assume 
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the  noise  is  linear  s lg na 1 - i ndependc nt  additive.  Because  the  Poisson 
noise  is  signal -dependent ,  the  estimation  of  <}> f  is  different. 

The  spectral  density  of  the  object  is  related  to  the  spectral 
density  of  the  detected  image  by 

*  (u,v)  N+(N)2! |V(u,v) | |2*f (u,v)  (8) 

where  N  is  ensemble  mean  number  of  photon  counts  and  «.  (u,v)  is  the 
Fourier  transform  of  the  PSF  h(x,y).  Thus,  the  spectral  density  of 
the  object  *  .  can  be  estimated  by  an  iterative  method,  although  this 
involves  an  inverse  filter  with  *■  .  We  do  not  investigate  the 

estimation  of  the  spectral  density  4^.  Instead  we  assume  that  4' ^  is  a 
white  spectral  density.  The  white  object  spectral  density  assumption 
extracts  more  higher  spatial  frequency  content  of  many  images  falls 
rapidly  at  high  spatial  frequencies. 

For  generality,  a  two-dimensional  moving  average  blurring  point 
spread  function  (PSF)  is  chosen  for  the  simulation  rather  than  a 
Gaussian  blurring  PSF  because  it  has  singularities  and  phase  reversals 
in  the  frequency  response.  Restored  images  with  the  LMMSE  filter  for 
different  equivalent  SNR’s  are  illustrated  in  Fig.  1.  The  amount  of 
restoration  is  controlled  by  the  &.  The  serious  effects  of 

i 1 1 -cond i t i on i ng  can  be  seen  in  Fig.  1.  The  restored  image  gradually 
blows  up  as  t  goes  to  higher  values, while  it  becomes  more  noisy  as  a 
goes  to  lower  values.  because  the  Poisson  noise  degradation  is  very 
pronounced  for  small  values  of  a  and  the  ill-conditioning  of  the 
inverse  filter  is  worse  at  larger  values  of  a,  implementation  of  the 

a 

LMMSE  filter  is  very  sensitive  to  the  equivalent  SNR  a. 

Figures  2  and  3  show  results  with  the  nonlinear  sectioned  MAP 
filter  implemented  with  Newt on-Raphson  iterative  techniques  as  before. 
Figure  2  has  results  with  no  blur,  and  Fig.  3  shows  results  with 

linear  blur.  The  experiments  show  that  the  MAP  estimate  is  superior 

at  all  signal-to-noise  ratios  tested. 


Restored  images  with  the  LMMSE  filter  for  different 
estimated  (SNR)  &  and  3x3  moving  window  blur. 


Restored  image  with  the  LMMSE  filter  for  ct=10 

Restored  image  with  the  LMMSE  filter  for  d=20 

Restored  image  with  the  LMMSE  filter  for  6t= 4 0 

Restored  image  with  the  LMMSE  filter  for  a=100 
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Figure  2a.  Images  restored  by  the  LMMSE  filter  and  the  MAP  filter  at 
(SNR) rms  =  >  2  .3  with  no  blurring 

(A)  Original  object  imaae 

(B)  Poisson  noisy  imaae 

(C)  Image  restored  by  the  LMMSE  filter 

(D)  Image  restored  by  the  MAP  filter 
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Figure  2b 


Images  restored  by  the  LMMSE  filter  and  the  MAF  filter  at 
5  with  no  blurrina 


rms 


Original  obiect  inane 
Poisson  noisy  image 

Restored  imacte  by  the  LMMSE  filter 
Restored  imaae  b\'  the  MAP  filter 


Figure  3a.  Images  restored  by  the  l.MMSE  filter  and  the  MAP  filter 

with  two-dimensional  1x3  moving  window  blurring  degradation 

at  (SNR)  =/2.5 
rms 

(A)  Original  obiect  image 

(B)  Degraded  image 

(C)  Restored  image  bv  the  LMMSE  filter 

(D)  Restored  image  by  the  MAP  filter 
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The  LMMSE  filter  and  the  MAP  filter  are  performing  similar 
functions  of  balancing  the  inverse  solution  with  a  smoothness 
constraint.  The  LMMSE  filter  uses  the  equivalent  SNR  to  control  the 
balancing,  while  the  MAP  filter  uses  the  covariance  matrix  R  of  the 
object  as  a  measure  of  the  confidence  in  the  nonstationary  mean  f  and 
the  maximum  likelihood  solution  j:ML  as  a  restoration  solution.  The 
LMMSE  filter  is  based  on  the  assumption  that  the  detected  image 
intensity  can  be  approximately  modeled  by  a  stationary  random  field. 
For  typical  images  however,  each  part  of  the  image  generally  differs 
sufficiently  from  other  parts  so  that  the  stationarity  is  not 
generally  valid.  Moreover,  the  LMMSE  filtering  process  is  insensitive 
to  abrupt  changes  of  image  signals.  This  results  in  edge  smoothing 
and  contrast  reduction.  The  MAP  filter  does  consider  the 
nonstationary  properties  of  random  image  fields  because  it  contains  an 
ML  term  and  an  a  priori  term.  The  MAP  filter  theoretically  needs  more 
s  priori  knowledge,  but  in  actual  implementation,  the  MAP  filter  uses 
less  a  priori  knowledge  than  the  LMMSE  filter.  Both  filters  require 
knowledge  of  the  blurring  matrix  H  and  the  mean  number  requires  the 
spectral  density  of  the  object  It  can  be  concluded  that  the  MAP 
filter  should  perform  better  then  the  LMMSE  filter  for  the  Poisson 
noise  model  . 

Cone  1  us  ions 

A  comparison  has  been  made  between  the  LMMSE  filter  and  the  MAP 
filter  with  the  Poisson  noise  model.  It  has  been  shown  that  the 
quality  of  the  restored  image  of  the  MAP  filter  is  sunerior  to  that  of 
the  LMMSE  filter  by  simple  subjective  evaluation.  Boulter  [14]  has 
shown  that  even  with  large  amounts  of  blurring  present  in  a  detected 
image,  a  low  noise  level  permits  almost  complete  restoration. 
However,  photon  resolved  image  signals  have  a  very  low  signal-to-noise 
ration,  making  it  impossible  to  obtain  perfect  restoration  for  image 
signals  suffering  from  both  blurring  and  Poisson  noise. 
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ABSTRACT: 

This  paper  describes  our  on-going  program  to  develop  special-purpose 
charge-coupled  device  and  metal  oxide  semiconductor  integrated  circuits 
for  real-time  image  processing.  This  work  has  emphasized  the  development 
of  circuits  that  will  perform  the  front-end, or  "low-level, "  processing 
functions  at  data  rates  in  excess  of  10^  pixels/sec.  We  describe  the 
design  and  fabrication  of  a  third  test  chip  which  will  perform  two- 
dimensional  processing  operations  over  kernel  sizes  ranging  from  3x3  to 
26x26  pixels.  Included  on  this  chip  are  data  programmable  operations 
for  processing  over  a  5x5  kernel  at  real-time  television  rates.  In 
addition,  we  describe  the  test  facilities  we  have  designed  and  built  to 
demonstrate  the  performance  of  these  circuits  and  the  initial  test 
results . 


II 


INTRODUCTION 


I . 

A  primary  aim  of  the  program  has  been  to  demonstrate  the  feasibility  of 
performing  image-understanding  algorithms  in  real  time.  For  our  purposes, 
we  define  "real  time"  to  be  equivalent  to  high-quality  television,  7.5-MHz 
data  rate.  Even  for  relatively  simple  operations  on  kernels  of  3x3  or 
5x5  pixels,  this  represents  a  speed  increase  over  conventional  general- 
purpose  computers  of  at  least  two  or  three  orders  of  magnitude.  To 
achieve  this  increased  thoughput,  we  have  designed  and  implemented  novel 
charge-coupled  device  (CCD)  and  metal  oxide  semiconductor  (MOS)  processing 
architectures  that  can  be  integrated  into  infrared  and  video  cameras. 

Based  on  the  work  funded  on  this  program,  we  are  currently  investigating 
several  military  applications  that  require  both  the  sensor  and  processor 
to  be  integrated  onto  a  single  chip  (the  so-called  "smart -sensor " 
phi losophy) . 

We  have  designed,  built,  and  tested  three  integrated-circuit  test 
chips  containing  the  14  algorithms  listed  in  Table  1.  Each  chip  lias 
been  used  to  demonstrate  a  different  approach  to  image  analysis.  The 
first  chip  shown  in  Figure  1  was  aimed  at  demonstrating  a  novel  two- 
dimensional  CCD  filtering  approach  which  allows  concatenation  of  several 
(in  this  case  5)  image-understanding  operations  and  has  been  designed  to 
be  integrated  into  the  sensor  directly  at  the  focal  plane.  Each  of  the 
functions  on  this  chip  operates  over  a  5x3  array  of  picture  elements  and 
provides  a  single  processed  picture  element  for  each  new  input  pixel. 

The  circuit  accepts  three  lines  of  video  data  equivalent  to  the  3x3 
array  and  as  such  requires  two  external  analog  delay  lines  when  operated 
from  a  vidicon  or  commercial  camera, as  shown  in  Figure  2.  In  our  initial 
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Table  1.  Integrated  Circuit  Primitives  Developed  to  Date 


7819-10  R 1 


OUTPUT 

CIRCUITRY 


Figure  1.  Photomicrograph  of  test  chip  I. 


Figure  2.  Formation  of  the  3x3  pixel  array  using 
external  analog  delay  lines. 
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work  to  operate  these  circuits  in  real  time,  we  have  used  Fairchild 
CCD  321  analog  line  delays.  These  have  worked  quite  well  but  limit 
both  the  dynamic  range  and  the  signal-to-noise  ratio  of  the  processor. 

We  are  currently  investigating  techniques  for  incorporating  the  circuits 
directly  into  a  CCD  imager  as  shown  in  Figure  3. 

The  second  chip,  shown  in  Figure  4,  contains  five  individual  circuits 
again  using  a  3x3  pixel  kernel  and  is  aimed  at  demonstrating  adaptive 
processing  using  the  local  mean  as  the  control.  This  circuit  has  been 
operated  directly  from  both  a  vidicon  and  CCD  camera  with  an  overall 
processing  accuracy  equivalent  to  4  hits.  The  operations  performed  in 
addition  to  the  3x3  average  are  adaptive  stretch,  binarization  based  on 
the  local  mean,  unsharp  masking,  and  Sobel  edge  detection.  Under  a 
parallel  contract  with  Night  Vision  Laboratories,  Fort  Belvoir,  Virginia, 
we  have  integrated  these  circuits  into  a  demonstration  processor,  shown 
in  Figure  5.  At  the  request  of  the  customer,  we  incorporated  a  CCD  field 
delay  to  remove  the  interlace  and  provide  a  processing  capability  on 
adjacent  lines  of  video.  This  processor  has  been  operated  at  a  4-MHz 
clock  rate,  and  the  results  are  reported  in  Ref  1. 

Our  recent  work  has  been  concerned  with  the  design,  processing,  and 
initial  evaluation  of  a  third  test  chip,  which  is  aimed  at  demonstrating 
processing  techniques  using  larger  kernel  sizes  (as  high  as  26x26  pixels) 
and  demonstrating  a  programmable  capability.  After  many  problems  and 
delays  in  obtaining  a  satisfactory  mask  set,  we  have  now  completed  the 
processing  of  this  chip  and  are  currently  collecting  preliminary  per¬ 
formance  data  on  each  of  the  five  circuits,  as  descirbed  below. 
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Figure  3.  Technique  for  integrated  CCD 
images  and  processor. 
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II.  PROGRESS  ON  TEST  CHIP  III 

The  principal  effort  this  period  has  been  with  the  design,  simulation, 
and  processing  of  this  chip.  Five  functions,  a  7x7  mask  programmable  array, 

a  3x3  Laplacian,  a  median  operator,  a  5x5  voltage  programmable  convolution, 

2 

and  a  large  26x26  element  convolution  for  the  primal  sketch,^  are  included. 
The  design  goal  is  for  a  15-MHz  clock  rate  and  an  overall  processing 
accuracy  equivalent  to  6  bits.  The  resolution  of  the  circuit  lithography 
is  5  to  7  pm,  equivalent  to  commercial  optical  techniques,  and  the  technology 

is  n-type  surface  channel.  The  bandwidth  requirements  for  this  chip  are 

/ 

towards  the  high  end  of  the  speed  capability  range  for  surface  channel 
devices  and  hence  represent  a  considerable  challenge.  Also,  the  kernel 
size  has  been  considerably  extended  from  the  9  pixels  used  in  our  previous 
work.  The  largest  processor,  the  convolution  for  the  primal  sketch, 
contains  338  pixels.  Further,  the  dynamic  range  required  by  the  operators 
contained  on  the  chip  is  much  increased,  representing  approximately 
8  bits,  and  we  are  including  special  techniques  to  achieve  this. 

Probably  the  most  significant  challenge  we  are  addressing  on  this  chip  is 
the  development  of  programmable  processing  kernels.  The  concept  here  is 
to  develop  a  general-purpose  convolutional  processor  that  can  accept 
data  at  real-time  video  rates  and  can  adapt  its  kernel  size  and  weights 
either  in  a  preprogrammed  way  or  in  response  to  the  processed  output  at 
a  speed  higher  than  the  frame  rate  (>30  Hz).  If  such  a  device  can  be 
developed  with  accuracy  equivalent  to  6  bits,  it  will  find  very  wide¬ 
spread  general  utility  in  image  analysis  and  understanding.  At  present, 
the  kernel  size  for  this  circuit  is  5x5,  but  there  is  no  fundamental 
limit  preventing  this  from  being  significantly  increased.  To  meet  the 


significantly  increased  demands  both  in  terms  of  speed  and  dynamic 
range,  we  have  included  an  on-chip  sample  and  hold  to  both  reduce  the 
output  noise  and  lower  the  clock  feedthrough.  This  should  significantly 
increase  the  performance  of  the  functions,  particularly  at  high  speed. 

A  schematic  of  the  circuit  is  shown  in  Figure  6.  This  device  has  been 
simulated  to  operate  at  an  11-MHz  data  rate  and  provide  a  60-dB  common 
mode  rejection  driving  a  30-pF  output  load.  This  circuit  is  included 
in  each  of  the  CCD  functions. 

A  photomicrograph  of  the  full  chip  showing  each  circuit  and  the 

test  devices  is  shown  in  Figure  7.  The  chip  itself  is  approximately 

2  2 
225  mils  ,  which  is  slightly  larger  than  our  previous  one  (191  mils  ). 

This  has  resulted  in  fewer  dice  per  wafer,  thus  requiring  higher  yield 

to  provide  acceptable  quantitites  for  testing.  We  are  currently 

processing  11  wafers,  each  with  36  dice/wafer.  This  should  hopefully 

result  in  enough  acceptable  circuits  for  initial  testing.  Later  we  will 

process  an  additional  lot  when  the  chip 's initial  operating  parameters 

have  been  determined. 

This  chip  has  now  been  designed  and  simulated  using  the  circuit 
analysis  and  simulation  program  SPluE.  The  individual  cells  have  been 
drawn  and  the  composite  digitized  by  the  mask  maker.  Because  of  high 
demand  in  the  IC  market,  the  turnaround  at  the  mask  maker  has  been  some¬ 
what  longer  than  originally  anticipated.  In  addition  to  this  delay, 
the  initial  mask  set  received  was  incomplete;  also,  some  of  the  masks 
were  reversed  field.  The  net  effect  of  these  two  delays  is  an  anticipated 
schedule  slippage  of  several  months.  We  have  only  recently  completed 
the  processing  required  prior  to  short  testing  the  devices,  dicing  the 
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wafer,  and  packaging.  However,  this  process  is  now  complete,  and  we 
have  started  the  initial  performance  evaluation.  Because  of  the  larger 
kernel  size  and  the  significantly  different  characteristics  of  these 
circuits,  we  essentially  have  had  to  rebuild  the  test  facility.  Consider¬ 
able  time  and  effort  was  expended  on  this  during  this  period,  as  discussed 
below. 

III.  PERFORMANCE,  EVALUATION,  AND  TESTING 

To  be  able  to  effectively  test  the  performance  of  the  microelectronic 

circuits  developed,  it  has  been  necessary  to  develop  an  appropriate  test 

facility.  The  essential  components  of  this  system  are  shown  in  Figure  8. 

The  functions  developed  on  the  test  chip  include  both  two-phase  and 

single-phase  circuits,  each  requiring  a  clock  swing  of  at  least  20  V. 

To  achieve  this,  we  have  developed  a  driven  system  that  operates  from  an 
2 

8-channel  (T  L  voltage  level)  word  generator  and  uses  these  outputs  to 
develop  MOS  level  waveforms.  This  generator  is  also  used  to  develop  the 
necessary  diffusion  and  reset  pulses.  The  word  generator  can  be  clocked 
at  a  20-MHz  bit  rate  to  process  a  3-MHz  bandwidth  signal.  The  imagery 
will  be  obtained  either  from  a  stored  data  base  or  a  vidicon  camera. 

In  either  case,  the  data  requires  formatting  into  several  parallel  video 
lines  to  form  the  appropriate  kernel  size.  The  system  to  do  this  has 
now  been  completed.  It  consists  of  a  10-MHz,  8-bit  analog-to-digital 
converter  system  that  provides  input  to  a  24-kbit  RAM  register. 

The  RAM  register  provides  a  delay  equivalent  to  six  horizontal  lines. 

A  digital-to-analog  converter  is  included  after  each  4  kbits  to  provide 
the  analog  output  from  the  adjacent  lines.  The  necessary  hardware  to 
provide  this  facility  has  now  been  completed  and  the  system  tested  with 
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Figure  8.  Schematic  of  test  facility. 
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a  commercial  vidicon.  In  addition,  a  signal  conditioner  box,  which  both 
translates  the  dc  level  of  the  resulting  video  data  and  can  provide  the 
necessary  variable  gain,  has  been  designed  and  built.  We  also  have 
provided  the  capability  to  vary  both  the  spatial  and  temporal  resolution 
of  the  processor  and  have  investigated  several  commercial  "frame  grabbers" 
and  digital  memories  to  provide  this  facility.  The  system  we  have  built 
is  based  on  the  Quantex  Field  Grabber  with  resolution  of  256x256  pixels 
each  with  6  bits  of  gray  scale.  We  have  interfaced  this  system  to  an 
IMSAI  8080  microcomputer  for  evaluation  and  have  written  several  software 
packages  to  manipulate  the  data  to  provide  both  simulation  of  image¬ 
understanding  operations  and  manipulation  for  display  purposes. 

We  are  currently  using  this  facility  to  evaluate  the  first  lot 
of  test  wafers.  We  have  decided  to  investigate  the  5x5  data  programmable 
convolution  and  the  median  operator  initially.  The  schematic  of  the  5x5 
programmable  operator  is  shown  in  Figure  9.  Essentially,  it  accepts 
five  parallel  lines  of  video  data  and  performs  a  25-point  bipolar 
convolution  on  a  sliding  5x5  pixel  array.  The  mathematical  formulation 
of  the  processor  is  given  by 


/\ 

where  I  is  the  intensity  of  the  processed  image,  I  is  the  original  image,  and 
the  W^j  are  the  programmable  weights.  The  processor  consists  of  a  two- 
dimensional  floating  gate  array  with  25  voltage-controlled  taps.  This 
array  overlays  five  separate  CCD  delay  lines  through  which  charge 
equivalent  to  each  picture  intensity  is  clocked.  A  single  source  follower 
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at  the  end  of  all  of  the  delay  lines  is  used  to  detect  the  linearity  and 
transfer  efficiency  of  the  basic  CCD  structure.  An  example  of  the 
operation  in  this  mode  is  given  in  Figure  10.  The  input  signal  is  shown 
on  the  upper  trace,  and  the  output  resulting  from  two  cycles  of  this  wave¬ 
form  is  shown  on  the  lower  trace.  The  operator  here  is  at  '''6  kHz, 
equivalent  to  the  pixel  rate  of  the  stored  data  base  microprocessor 
system.  The  photograph  illustrates  the  need  for  an  output  stage  including 
a  sample  and  hold.  The  reset  pulses  occurring  each  pixel  interval  can 
be  seen  to  have  about  the  same  magnitude  as  the  diffusion  output.  These 
are  due  entirely  to  f eedthrough  and  are  unwanted  outputs.  To  avoid  this, 
we  have  tested  the  on-chip  sample  and  hold  as  shown  in  Figure  11.  In  this 
case,  the  input  signal  consists  of  a  single  frequency  that  is  sampled 
each  60  psec.  At  these  intervals,  the  output  level  of  the  waveform  is 
sampled  and  frozen  until  the  next  sample  pulse.  This  circuit  is  included 
on  all  the  active  CCD  outputs  and  is  used  to  eliminate  the  unwanted  clock 
to  reset  feedthroughs.  The  initial  results  obtained  from  the  floating 
gate  are  shown  in  Figure  12(a  and  b) .  In  each  case,  the  input  signal  is 
shown  on  the  upper  trace  and  the  resulting  impulse  response  shown  below. 

In  Figure  12(a)  the  voltage  settings  on  the  floating  arrays  are  equivalent 
to  0,  -1,  0,  -1,  0,  and  in  Figure  12(b)  to  0,  1,  0,  -1,  0.  In  each  case, 
the  input  response  equivalent  to  this  setting  is  obtained,  verifying  both 
the  programmabi  1  i ty  and  the  capability  of  bipolar  weighting.  Figure  12(,e) 
shows  the  impulse  response  equivalent  to  1 ,  1,  1,  1,  1.  Although  these 
are  preliminary  results,  it  is  clear  that  all  the  necessary  functions  are 
operating  on  the  chip.  The  dynamic  range  of  the  device  obtained  so  far 
is  illustrated  in  Figure  13.  It  is  equivalent  to  about  14  gray  levels. 


Figure  10.  Output  from  source— follower  on  the  3x5  programmable  processor. 


9136-8 


INPUT 

WAVEFORM 


OUTPUT 

WAVEFORM 


Figure  11.  Operation  of  the  on-chip  sample  and  hold. 
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Figure  13.  Linearity  and  dynamic  range  of  Sxb  pro¬ 
grammable  processor. 
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The  granularity  on  the  trace  shown  is  due  in  part  to  the  microprocessor 
system  used  to  obtain  this  transfer  characteristic.  Our  work  will  continue 
on  this  circuit  to  improve  both  the  accuracy  and  dynamic  range  and  to 
increase  the  speed  to  7.5  MHz.  We  have,  however,  started  to  process  some 
test-patterns,  as  shown  in  Figure  14 .  The  first  consists  of  a  two- 
dimensional  grid  with  lines  each  a  single  pixel  in  width.  This  has  been 
processed  with  the  processor  set  for  a  1,  1,  1,  1,  1  impulse  response  in 
both  x  and  y.  The  resulting  grid  with  each  line  5  pixels  wide  is  shown 
in  Figure  14(b).  This  corresponds  directly  to  the  simulation  shown  in 
Figure  14(c).  A  similar  result  for  the  picture  of  the  house  is  shown 
in  Figure  15.  The  impulse  response  for  the  programmable  array  setup  to 
be  both  a  plus  shape  and  a  cross  shape  are  shown  in  Figure  lb. 

During  the  next  phase  of  the  program,  we  will  continue  the  testing 
and  evaluation  of  t  'o  circuits  and  provide  a  detailed  performance 
review. 

IV.  STUDY  OF  VLSI  FOR  IMAGE  UNDERSTANDING 

In  addition  to  the  work  on  the  CCD/MOS  circuitry  and  building  a 
new  test  facility,  we  have  begun  an  analysis  program  to  determine  the 
impact  and  advantages  ot  verv  large  scale  integration  (VLSI)  on  image 
understanding.  Tlu  on-going  work  in  software  and  algorithm  development 
clearly  is  leading  to  an  ever- increasing  demand  for  computat ional 
throughput.  Further,  it  is  evident  that,  even  with  the  most  sophisticated 
general-purpose  machines,  the  processing  times  are  incompatible  with  any 
real-time  application.  An  apparent  solution  to  these  issues  is  the 
development  of  new  processing  architectures  based  on  the  latest 
technolog ies .  Two  significant  developments  have  been  taking  place  in 
microelectronics  in  the  past  several  years.  The  first  is  the  development 
of  a  variety  of  new  technolgoies  such  as  DMOS,  CMOS/SOS,  1~L,  KCL,  and 
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GaAs.  Kach  offers  a  different  combination  of  parameters  In  terms  of 
speed,  power  consumption,  and  the  possible  level  of  integration.  It  Is 
of  particular  importance  to  the  I. II,  program  to  be  able  to  evaluate 
the  advantages  and  constraints  of  all  the  available  technologies  with 
respect  to  our  programs.  In  addition,  with  increased  resolution  of 
Integrated -clrcul t  features  and  decreased  power  consumption,  the  level 
of  integration  within  each  chip  has  greatly  increased.  For  example, 
in  highly  regular  arrays,  such  as  memories,  as  many  as  in'1  to  10*’ 
functions  can  be  integrated  in  each  chip.  This  development  will  also 
have  profound  effects. 

To  some  extent,  the  design  and  architecture  of  high-density 
circuitry  other  than  memories  and  commercial  microprocessors  have  lagged 
behind  these  developments,  and  a  significant  advance  can  be  expected 
from  the  optimum  design  of  linage-processing  architectures.  To  achieve 
the  maximum  advantages  of  Vl.Sl  for  the  I.U.  problem,  two  precepts  must  be 
adhered  to.  First,  t  he  designs  and  architectures  must,  where  possible 
provide  concurrent  operation.  The  obvious  bottlenecks  that  result  in 
the  von  Neuman  concepts  of  a  single  arithmetic  unit,  etc.  can  be  cir¬ 
cumvented  by  highly  localized  operation  with  multiple  primitives.  This 
removes  many  of  the  problems  from  the  processor  and  places  them  in  the 
control  and  data  distribution  system.  There  is,  for  example,  a  need  for 
localized  distributed  storage  and  memory  associated  with  each  primitive 
for  data  and  instruction  queuing.  In  addition,  both  for  ease  of  design 
and  for  Increased  packing  density,  the  circuitry  mint  be  highly  regular 
on  the  silicon  surface. 
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From  our  previous  work  on  this  program,  it  is  clear  that,  using 
current  state-of-the-art  technology,  low  to  intermediate  level  primitives 
can  be  built  that  will  provide  real-time  operation  without  requiring 

extensive  area  on  the  silicon.  From  this  work,  we  anticipate  that  five 

2 

or  six  primitives  might  be  included  in  a  single  200-mil  chip.  In 
support  of  this  concept,  we  are  currently  investigating  concepts  for 
data  distribution  and  intelligent  local  storage  as  well  as  techniques  such 
as  residue  operation  and  number  theoretic  transformation  for  regularizing 
the  processes  themselves.  The  eventual  aim  of  this  work,  which  will 
continue  in  the  next  period,  is  to  determine  an  optimum  way  of  mapping 
the  algorithms  and  processors  onto  the  two-dimensional  silicon. 

V.  SUMMARY  AND  FUTURE  PLANS 

During  t he  current  period,  our  work  has  been  concentrated  in  three 
areas:  the  fabrication  of  Test  Chip  Ill,  the  development  of  an 

effective  test  facility  and  preliminary  testing  of  the  circuits,  and  an 
initial  study  of  the  effect  of  VLSI  on  image  understanding.  Progress  in 
each  of  these  areas  has  been  satisfactory,  although  unexpected  delays 
have  been  encountered  primilarly  with  the  outside  mask  maker.  The 
problems  with  this  vendor  have  created  an  unavoidable  delay  of  at  least 
two  months  and  have  been  largely  responsible  for  our  delay  in  the  testing 
schedule.  However,  all  11  masks  have  now  been  received,  and  the  wafers 
have  been  processed  and  short  tested.  We  have  tompleted  nearly  all  the 
work  on  the  test  facility  and  are  now  getting  Initial  test  results. 

This  work  will  continue  next  period,  and  we  intend  lo  Issue  ac  Interim 
report  detailing  the  tost  results  as  available.  In  addition  to  this,  we 
have  started  our  VLSI  study  and  arc  starting  to  formulate  an  approach 


14  4 


for  this  major  t.isk  next  year. 
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